Targeted Integration And Stacking Of DNA Through Homologous Recombination

ABSTRACT

The invention provides various methods for the targeted integration and stacking of nucleotide sequences in the genome of a host plant cell using homologous recombination.

FIELD OF THE INVENTION

The present invention relates generally to transgenic plants and, more particularly, to site-specific integration and stacking of nucleotide sequences in the genome of a host cell through homologous recombination.

BACKGROUND

In recent years, the development of genetic engineering techniques has had dramatic implications in the field of crop improvement. Using these techniques, beneficial traits can be introduced into almost any crop and improved crops can be rapidly obtained. The use of genetic engineering obviates the need for lengthy procedures to introduce the desired trait by conventional breeding methods.

Present plant transformation methods generally integrate a single transgene into the host genome. Successful integration of each transgene requires repeated confrontation of various issues, such as variability in transgene expression caused by different integration loci, so-called “position effects,” and the risk of creating a mutation in the genome upon integration of the transgene into the host. Consequently, a large number of transformation events must be screened and tested before obtaining a transgenic plant that exhibits the desired level of transgene expression without also exhibiting abnormalities resulting from the inadvertent insertion of the transgene into an important locus in the host genome. Moreover, if an additional transgene is subsequently added to a transgenic plant, the additional transgene likely will be integrated into the genome at a location that is different from the location of the pre-existing transgene, rendering the breeding of elite plant lines with both genes difficult and cumbersome.

An inherent problem with such single-round integration techniques is that sequence stacking, or the successive integration of multiple nucleotide sequences at a predetermined locus in the host genome, is difficult to accomplish. However, efficient sequence stacking is desirable for a variety of reasons. For example, the ability to achieve targeted insertion of multiple transgenes into a host would facilitate registration of a transgenic plant with government regulatory agencies, since the potential for random alteration of the plant's genetic material would be minimized. Further, in some cases, such as the engineering of traits or metabolic pathways that involve multiple genes, co-location of the transgenes would be highly desirable. Additionally, since only a limited number of selectable and scoreable marker sequences may be available for use in transforming a given crop, the ability to re-use a marker sequence when introducing successive nucleotide sequences into the host genome would also be desirable.

SUMMARY

The present disclosure provides methods for the targeted integration and stacking of nucleotide sequences in the genome of a host cell using homologous recombination. A target sequence in the genome of a host cell and a donor sequence introduced into the host cell each comprise a homology sequence that permits homologous recombination to occur between the target and donor sequences. In one embodiment, a homology sequence shared by a target sequence and a donor sequence comprises at least one intron sequence that lengthens the region of homology and thereby enhances the frequency of homologous recombination between the target and donor sequences.

BRIEF DESCRIPTION OF THE FIGURES

FIGS. 1-12B illustrate various exemplary embodiments of the invention.

FIGS. 13A-13F are schematic representations of a modified nptII gene with multiple introns, target and donor DNA constructs, endonucleases, and a FLP expression construct. FIG. 13A: A schematic representation of a modified nptII gene with four Arabidopsis intron insertions (i.e., the nptII-intron gene sequence). “FRT” indicates a FLP recognition sequence. FIG. 13B: A positive control construct (pNOV2731) containing the full-length nptII-intron. “Phsp80” indicates an HSP80 promoter; “BAR” indicates a Basta® herbicide resistance gene; “Tnos” indicates a nos terminator; “Pmsmas” indicates a modified SMAS promoter; “Tpal” indicates the Arabidopsis PAL1 terminator. FIG. 13C: A target DNA construct (pNOV2701) containing the modified nptII gene truncated at the 5′-region. FIG. 13D: Donor DNA constructs (pNOV2736, pNOV2737, pNOV2755, pNOV2757) containing the nptII-intron gene truncated at different places in the 3′-coding region. “Hpt” indicates a hygromycin phosphotransferase gene; hpt includes the Arabidopsis ubq3 promoter and terminator. FIG. 13E: A yeast HO endonuclease expression cassette. FIG. 13F: A FLP recombinase expression vector (pNOV2762) with Arabidopsis PPO(dm) as a selectable marker. PPO(dm) is under the control of its native Arabidopsis ptx promoter.

FIGS. 14A-14C illustrate PCR screening and analysis of targeted events. FIG. 14A: A schematic representation of a target locus derived from pNOV2701, T-DNA of donor pNOV2736, a recombination product, and PCR primers. Striped boxes represent genomic DNA sequences flanking the T-DNA insertion. FIG. 14B: PCR analysis of events targeted to a predetermined location in the genome of tobacco line T2701.06 using PSMASFW2 and NPTR6 primers. Targeted events produce a 3.5 kb fragment. “M” indicates a DNA size marker (i.e., Lambda DNA digested with StyI; 19.3, 7.7, 6.2, 3.5, 2.7, 1.9, 1.5, 0.9, 0.4 kb). Lane 1, negative control, untransformed SR1 tobacco; lane 2, positive control T2731.1; lane 3, negative control, target line T2701.6; lane 4, HR-01AB.1; lane 5, HR-01AB.2; lane 6, HR-01AB.3; lane 7, HR-01AC.1; lane 8, HR-01AD.1; lane 9, HR-01AD.4; lane 10, HR-01AE.1; lane 11, HR-01AE.2. FIG. 14C: PCR amplification of targeted events with primers from flanking genomic DNA sequences. “M” is a DNA ladder (10, 8, 6, 5, 4, 3, 2, 1.5, 1.0, 0.5 kb; the 3 kb band has the strongest signal; New England Biolabs, Beverly, Mass.). Lanes 1 to 5 with PDFSP1 and HYGRV1 primers: lane 1, HR-01AB.1; lane 2, HR-03AB.1; lane 3, HR-03AD.2; lane 4, HR-05AA.2; lane 5, HR-02AC.1. Lanes 6 to 9 with PDFSP1 and PALEXONV primers: lane 6, HR-01AB.1; lane 7, HR-01AB.1×SR1 kanamycin resistant progeny; lane 8, HR-03AB.1×SR1 kanamycin resistant progeny; lane 9, HR-03AD.2.

FIGS. 15A-15C represent a Southern blot analysis of targeted events. FIG. 15A: A schematic representation of target and donor vectors, restriction sites, and probes. FIG. 15B: A blot probed with an HSP80 promoter fragment. “M” is a DNA marker (Lambda DNA digested with StyI). Lanes 1-4, target line T2701.6; lanes 5-8, HR-03AD.2; lanes 9-11, HR-05AA.1; lanes 12-14, HR-05AA.2. Lanes 1, 5, 9, 12 with EcoRV; lanes 2, 6, 10, 13 with SacI; lanes 3, 7, 11, 14 with NheI; lanes 4 and 8 with SpeI. FIG. 15C: The same blot was stripped and re-probed with the nptII exon 5::PAL1 3′-UTR fragment.

FIGS. 16A-16B represent a PCR analysis of recombinant lines that have been re-transformed with a FLP expression vector. FIG. 16A: Tubq3fw and NptR3 primers were used for PCR amplification of lines obtained from HR-03AD.2 progeny re-transformed with pNOV2762. The 1.5 kb band indicates excision of the mSMAS promoter and part of the nptII-intron sequence. FIG. 16B: Tubq3fw and NptR2 primers were used for PCR amplification of progeny of HR-08AA32R2×SR1. The 932 bp band indicates excision of the mSMAS promoter and part of the nptII-intron sequence. Lane 1, recombinant HR-03AD.2 control; lanes 2-4, progeny with complete excision of the mSMAS promoter and part of the nptII-intron sequence.

FIGS. 17A-17B illustrate a PMI-intron gene, a monocot target DNA construct, a donor DNA construct, and a positive control vector. FIG. 17A: A schematic drawing showing a PMI-intron gene sequence, the T-DNA region of monocot target vector pNOV5025, pAdF55, and the positive control vector pNOV5026. “SRRS” indicates a site-specific recombinase recognition sequence. “OsAct1” is a rice actin 1 promoter; “Hpt” is a hygromycin phosphotransferase gene; “CMPS” is a Cestrum virus promoter; “ZmUbi” indicates a maize ubiquitin promoter; “GUS” is a β-glucuronidase gene; “PPO” is a mutant Arabidopsis protoporphyrin oxidase gene. FIG. 17B: Donor vectors pNOV5031, pNOV5045, pNOV5096, and pQD200C6.

FIG. 18 illustrates an exemplary embodiment of transgene targeting in maize.

FIGS. 19A-19D represent restriction maps and fragment sizes of target locus AW289B1A, the T-DNA of donor vector pNOV5045, and a putative double crossover recombinant with different probes. The change in size of each restriction fragment is represented in the lower portion, with the size in the target locus and in the recombinant indicated. The short bar under the restriction map represents the location of each probe.

FIGS. 20A-20D illustrate Southern blot analyses of a maize target plant AW289B1A and a targeted recombination event HR-18FB.1M. The blot was hybridized with the following probes: FIG. 20A: the PMI-intron 3′-region (intron 4-exon 5) that is present in the target T-DNA but not in the donor; FIG. 20B: the rice actin-1 5′-region fragment that is present in the target T-DNA but not in the donor; FIG. 20C: the GUS 3′-probe, which hybridizes to sequence present only in the donor; and FIG. 20D: the PPO 3′-probe, which hybridizes to sequences present in both the target and donor. The hybridization probes were spiked with one microliter of labeled Lambda DNA to show the molecular weight marker. Lane M had Lambda DNA digested with StyI. The fragment sizes are: 23578 bp, 19324 bp, 7743 bp, 6225 bp, 4254 bp, 3472 bp, 2690 bp, 1882 bp, 1489 bp, 925 bp, and 421 bp. The 421 bp fragment is not shown in the figures. Lanes 1 to 5 include DNA from target plant AW289B1A; lanes 6-10 include DNA from targeted event HR-18FB.1M. The restriction enzymes used to digest DNA in each lane are: lanes 1 and 6, SacI; lanes 2 and 7, ScaI; lanes 3 and 8, KpnI; lanes 4 and 9, SpeI; lanes 5 and 10, HpaI.

DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS

The following definitions are provided to enable a clear and consistent understanding of the specification and the claims. Unless otherwise noted, terms are to be understood according to conventional usage by those of ordinary skill in the relevant art. The nomenclature for DNA bases as set forth at 37 C.F.R. §1.822 as well as the standard one- and three-letter nomenclature for amino acid residues are used throughout the disclosure.

A “coding sequence” is a nucleic acid sequence that can be transcribed into RNA, such as mRNA, rRNA, tRNA, snRNA, sense RNA, or antisense RNA, within a host cell into which the coding sequence has been introduced. In the case of mRNA, for example, the mRNA then can be translated within the host cell to produce a protein. A “coding region” comprises a coding sequence.

“Donor,” “donor molecule,” “donor DNA,” and “donor sequence” are used interchangeably to refer to a desired nucleotide sequence that one wishes to recombine into a target DNA sequence using site-directed homologous recombination. The donor sequence can include any desired nucleotide sequence, such as, for example, a gene, an expression cassette, a promoter, a molecular marker, a selectable marker, a visible marker, a portion of any of these, or the like. A “donor construct” or “donor vector” contains a donor sequence.

“Endogenous,” as used herein, means “of the same origin,” i.e., derived from a host cell.

An “excisable sequence” refers to a nucleotide sequence comprising at least a portion of a marker sequence as well as at least one recombinase recognition site. An excisable sequence is contained within a target sequence.

“Expression” of a gene or other nucleotide sequence of interest refers to the transcription of the nucleotide sequence of interest to produce a corresponding RNA. In the case of an mRNA, the RNA may then be translated to produce a corresponding gene product (i.e., a peptide, a polypeptide, or a protein). Gene expression is controlled or modulated by regulatory elements, including 5′ regulatory elements, such as a promoter, for example.

“Expression cassette,” as used herein, includes a DNA sequence capable of directing expression of a particular nucleotide sequence in an appropriate host cell. An expression cassette typically comprises a promoter operably linked to a nucleotide sequence of interest, which is operably linked to a terminator or termination signal or to sequences containing an RNA polyadenylation signal. The expression cassette may also comprise sequences that permit proper translation of the nucleotide sequence, such as a translation initiation site and a translation termination sequence. Unique endonuclease restriction sites may also be included at the ends of an expression cassette to allow the cassette to be easily inserted or removed when creating a DNA construct. The nucleotide sequence of interest usually codes for a protein of interest but may also code for a functional RNA of interest, for example antisense RNA or a nontranslated RNA that, in the sense or antisense direction, inhibits expression of a particular gene, e.g., antisense RNA or double-stranded interference RNA. The expression cassette comprising the nucleotide sequence of interest may be chimeric, meaning that at least one of its components is heterologous with respect to at least one of its other components. The expression cassette may also be one that is naturally occurring but has been obtained in a recombinant form useful for heterologous expression. Typically, however, the expression cassette is heterologous with respect to the host, that is, the particular DNA sequence of the expression cassette does not occur naturally in the host cell and must be introduced into the host cell or an ancestor of the host cell by a transformation event. The expression of the nucleotide sequence in the expression cassette may be under the control of either a constitutive promoter or an inducible promoter that initiates transcription only when the host cell is exposed to some particular external stimulus. In the case of a multicellular organism, such as a plant, the promoter may also be specific to a particular tissue or organ or stage of development.

A “foreign” gene or DNA sequence includes a gene or other nucleotide sequence of interest that is not normally found in the host organism but that may be introduced by gene transfer. Foreign genes and DNA that are not integrated into the genome are referred to as “extrachromosomal”.

The term “gene” is used broadly to include any segment of a nucleotide sequence associated with a biological function. Thus, a gene can include a coding sequence either with or without the regulatory sequences needed for its expression. A gene can also include nonexpressed DNA segments, such as 5′ and 3′ untranslated sequences, recognition sequences for proteins, and/or termination sequences, for example. Further elements that may be present include, for example, introns. Some genes can be transcribed into mRNA and then translated into polypeptides (e.g., structural genes); other genes can be transcribed into RNA (e.g., rRNA and tRNA); and other types of genes function as regulators of expression (i.e., regulatory genes).

“Gene of interest,” “sequence of interest,” and “DNA of interest” are used interchangeably and include any nucleotide sequence which, when transferred to a plant, confers upon the plant a desired trait, characteristic, or biological function, such as, for example, virus resistance, insect resistance, resistance to other pests, disease resistance, herbicide tolerance, improved nutritional value, improved performance in an industrial process, or altered reproductive capability. A sequence of interest can be a marker sequence. A sequence of interest can also encode an enzyme involved in a biochemical pathway, the expression of which alters a trait that is important or useful in food, feed, nutraceutical, and/or pharmaceutical production.

“Genome” refers to the complete genetic material of an organism.

“Heterologous,” as used herein, means “of different natural origin,” that is, representing a non-natural state. For example, if a host cell is transformed with a gene derived from another organism, particularly from another species, that gene is heterologous with respect to the host cell and also with respect to descendants of the host cell that carry the gene. Further, “heterologous” may also be used to refer to a nucleotide sequence which is derived from a natural or original cell type and is inserted into that same natural or original cell type, but which is present in a non-natural state, such as, for example, in a different copy number, under the control of different regulatory elements, or the like.

“Homologous recombination” refers to a reaction between any pair of nucleotide sequences having corresponding sites containing a similar nucleotide sequence (i.e., homologous sequences) through which the two molecules can interact (recombine) to form a new, recombinant DNA sequence. The sites of similar nucleotide sequence are each referred to herein as a “homology sequence”. Generally, the frequency of homologous recombination increases as the length of the homology sequence increases. Thus, while homologous recombination can occur between two nucleotide sequences that are less than identical, the recombination frequency (or efficiency) declines as the divergence between the two sequences increases. Recombination may be accomplished using one homology sequence on each of the donor and target molecules, thereby generating a “single-crossover” recombination product. Alternatively, two homology sequences may be placed on each of the target and donor nucleotide sequences. Recombination between two homology sequences on the donor with two homology sequences on the target generates a “double-crossover” recombination product. If the homology sequences on the donor molecule flank a sequence that is to be manipulated (e.g., a sequence of interest), the double-crossover recombination with the target molecule will result in a recombination product wherein the sequence of interest replaces a DNA sequence that was originally between the homology sequences on the target molecule. The exchange of DNA sequence between the target and donor through a double-crossover recombination event is termed “sequence replacement.”
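The outcome of a double-crossover event described above can be illustrated with a toy string model. The sketch below is only an illustration of the bookkeeping, not a model of the biology: it assumes the two homology sequences match exactly and occur in the same order on target and donor, and the function name `double_crossover` and all sequences are hypothetical.

```python
def double_crossover(target: str, donor: str, left_hom: str, right_hom: str) -> str:
    """Toy model of sequence replacement by double crossover.

    Assumes (for illustration only) that both homology sequences are present
    as exact matches, in the same order, on the target and on the donor.
    """
    # Locate the left and right homology sequences on the target.
    t_left = target.find(left_hom)
    t_right = target.find(right_hom, t_left + len(left_hom)) if t_left >= 0 else -1
    # Locate the same homology sequences on the donor.
    d_left = donor.find(left_hom)
    d_right = donor.find(right_hom, d_left + len(left_hom)) if d_left >= 0 else -1
    if min(t_left, t_right, d_left, d_right) < 0:
        raise ValueError("both homology sequences must be present on target and donor")

    # The donor segment flanked by the homology sequences (the sequence of interest).
    sequence_of_interest = donor[d_left + len(left_hom):d_right]
    # The recombination product keeps the target flanks and swaps the interior.
    return target[:t_left + len(left_hom)] + sequence_of_interest + target[t_right:]


if __name__ == "__main__":
    left, right = "AAAACCCC", "GGGGTTTT"
    target = "GGG" + left + "ttttt" + right + "CCC"   # lowercase = sequence to be replaced
    donor = left + "ACGTACGT" + right                 # interior = sequence of interest
    print(double_crossover(target, donor, left, right))
```

In this model the target sequences outside the homology sequences are unchanged, and only the segment between them is exchanged, which is the sense in which the event is a “sequence replacement.”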

To “identify” a recombination product means that the recombination product is detected and distinguished from the starting target and donor sequences. There are many means of identifying a recombination product. For example, a selectable marker gene can be used, whereby site-specific integration results in the selectable marker gene becoming operatively linked with a promoter only in a recombination product. Alternatively, a visible marker gene can be used, whereby a gain or loss of marker gene expression identifies a recombination product. Alternatively, a negative selectable marker gene can be used, whereby a loss or lack of expression of the marker gene identifies a recombination product. Additionally, molecular markers that are characteristic of the target sequence and/or donor sequence can be used, such that the molecular marker pattern is unique for the recombination product.

“Integration” refers to the incorporation of a foreign gene or other nucleotide sequence into a host genome through covalent bonding to the host DNA.

An “isolated” nucleic acid molecule or an isolated protein or toxin is a nucleic acid molecule or protein or toxin that, by the hand of man, exists apart from its native environment and is therefore not a product of nature. An isolated nucleic acid molecule or protein or toxin may exist in a purified form or may exist in a non-native environment, such as, for example, a recombinant host cell or a transgenic plant.

A “marker sequence” refers to any nucleotide sequence that can be used to differentiate a transformed cell from a nontransformed cell. Marker sequences include, but are not limited to, selectable markers, scoreable markers, and molecular markers. Exemplary marker sequences include antibiotic resistance genes (such as, e.g., those conferring resistance to tetracycline, ampicillin, kanamycin, neomycin, hygromycin, and spectinomycin), luminescence genes (such as, e.g., genes encoding luciferase, β-galactosidase, green fluorescent protein (GFP), β-lactamase, or chloramphenicol acetyl transferase (CAT)), and genes conferring an enhanced capacity, relative to non-transformed cells, to utilize a particular compound as a nutrient, growth factor, or energy source (such as, e.g., a gene encoding phosphomannose isomerase (PMI)).

“Mega-endonuclease” refers to a rare-cutting endonuclease that is capable of making a site-specific double-strand break in DNA at a particular recognition sequence comprising at least about 12 base pairs. The recognition sequence may be somewhat lengthy and can be as long as about 40 base pairs. One type of mega-endonuclease is referred to as a homing endonuclease, which is an enzyme that is encoded by an intron or an intein (Belfort and Roberts, 1997 Nucl. Acids. Res. 25(17): 3379-3388; see also, Gauthier et al., 1991 Current Genet. 19: 43-47). Exemplary mega-endonucleases include, but are not limited to, I-SceI, I-CeuI, I-PpoI, I-CreI, I-DmoI, I-SceII, I-TevI, I-TevII, PI-PfuI, PI-PspI, PI-SceI, and HO, as described herein or otherwise known in the art (see, e.g., Belfort and Roberts (1997)).
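Why a recognition sequence of about 12 to 40 base pairs makes an endonuclease “rare-cutting” can be seen from a back-of-the-envelope estimate. The sketch below assumes, purely for illustration, a random genome with equal base frequencies and independent positions, considers one strand only, and ignores end effects; the genome size used and the function name `expected_random_sites` are hypothetical and do not come from the disclosure.

```python
def expected_random_sites(genome_bp: float, site_len: int) -> float:
    """Expected number of chance occurrences of one specific site of length
    site_len in a random sequence of genome_bp bases.

    Simplifying assumptions (illustrative only): equal base frequencies,
    independent positions, one strand, end effects ignored.
    """
    return genome_bp / (4 ** site_len)


if __name__ == "__main__":
    genome_bp = 2.5e9  # hypothetical genome size, chosen only for illustration
    for site_len in (6, 12, 18, 40):
        sites = expected_random_sites(genome_bp, site_len)
        print(f"{site_len:>2} bp site: about {sites:.3g} expected chance occurrences")
```

Under these assumptions the expected count falls by a factor of four for every additional base pair, so a site of 12 or more base pairs is expected to occur by chance far less often than the 4 to 8 base pair sites of typical restriction enzymes.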

“Native” refers to a gene that is present in the genome of an untransformed (e.g., a “wild-type”) cell.

“Naturally occurring” is used to describe an object that can be found in nature, as distinct from being artificially produced by man. For example, a protein or nucleotide sequence present in an organism (including a virus), which can be isolated from a source in nature and which has not been intentionally modified by man in the laboratory, is naturally occurring.

A “nucleic acid molecule,” “nucleic acid sequence,” or “nucleotide sequence” is a segment of single- or double-stranded DNA or RNA that can be isolated from any source. In the context of the present invention, the nucleic acid molecule is preferably a segment of DNA.

“Operably linked” and “operatively linked” refer to a relationship between two or more nucleotide sequences that interact physically or functionally. For example, a promoter or regulatory nucleotide sequence is said to be operably linked to a nucleotide sequence that encodes an RNA or a protein if the two sequences are situated such that the regulatory nucleotide sequence will affect the expression level of the coding or structural nucleotide sequence. A 5′ portion of a gene is operatively or operably linked with a 3′ portion of a gene if the two portions are situated to form a functional gene.

The term “plant”, as used herein, refers to, without limitation, whole plants, plant organs (e.g., leaves, stems, roots, fruit, etc.), seeds, plant cells and progeny of plant cells, plant tissue, plant cell or tissue cultures, protoplasts, callus, and any groups of plant cells organized into structural and/or functional units. A plant “regenerated” from a plant cell means that all cells of the plant are derived from that plant cell. The class of plants that can be used in the methods of the invention is generally as broad as the class of higher plants amenable to transformation techniques, including both monocotyledonous and dicotyledonous plants. Exemplary plants include, without limitation, Acacia, alfalfa, aneth, apple, apricot, artichoke, Arabidopsis, arugula, asparagus, avocado, banana, barley, bean, beet, blackberry, blueberry, broccoli, Brussels sprouts, cabbage, canola, cantaloupe, carrot, cassava, cauliflower, celery, cherry, chicory, clover, cilantro, citrus, clementines, coffee, corn, cotton, cucumber, eggplant, endive, escarole, eucalyptus, fennel, figs, garlic, gourd, grape, grapefruit, hemp, honeydew, jicama, kiwifruit, lettuce, leeks, lemon, lime, mango, maize, melon, mushroom, nectarine, nut, oat, okra, onion, orange, an ornamental plant, papaya, parsley, pea, peach, peanut, pear, pepper, persimmon, pineapple, plantain, plum, pomegranate, potato, pumpkin, quince, radicchio, radish, raspberry, rice, rye, safflower, sorghum, soybean, spinach, squash, strawberry, sugarbeet, sugarcane, sunflower, sweet potato, sweetgum, tangerine, tea, tobacco, tomato, triticale, turf, turnip, a vine, watermelon, wheat, yams, zucchini, and woody plants such as coniferous and deciduous trees. Once a gene of interest has been transformed into a particular plant species, the gene may be propagated in that species or may be moved into other varieties of the same species, including commercial varieties, using traditional breeding techniques.

“Plant cell” refers to a structural and physiological unit of a plant, comprising a protoplast and a cell wall, and includes, without limitation, seed suspension cultures, embryos, meristematic regions, callus tissue, leaves, roots, shoots, gametophytes, sporophytes, pollen, and microspores. The plant cell may be in the form of an isolated single cell, a cultured cell, or a part of a higher organized unit such as, for example, plant tissue, a plant organ, or a whole plant.

“Plant cell culture” means cultures of plant units such as, for example, protoplasts, cell culture cells, cells in plant tissues, pollen, pollen tubes, ovules, embryo sacs, zygotes, and embryos at various stages of development.

“Plant material” refers to leaves, stems, roots, flowers or flower parts, fruits, pollen, egg cells, zygotes, seeds, cuttings, cell or tissue cultures, or any other part or product of a plant.

A “plant organ” is a distinct and visibly structured and differentiated part of a plant, such as a root, stem, leaf, flower bud, or embryo.

“Plant tissue” as used herein means a group of plant cells, including any tissue of a plant either in planta or in culture, organized into a structural and functional unit. The term includes, but is not limited to, whole plants, plant organs, plant seeds, tissue culture, and any groups of plant cells organized into structural and/or functional units. The use of this term in conjunction with, or in the absence of, any specific type of plant tissue, as listed above or otherwise embraced by this definition, is not intended to be exclusive of any other type of plant tissue.

A “promoter” is an untranslated DNA sequence that is located upstream of a coding region, contains a binding site for RNA polymerase II, and initiates transcription of the DNA. The promoter region may also include other elements that act as regulators of gene expression.

A “protoplast” is an isolated plant cell without a cell wall or with only parts of the cell wall.

“Recognition site” or “recognition sequence” refers to a DNA sequence recognized by an enzymatic protein, such as, for example, a recombinase or an endonuclease. In the case of a recombinase, the recognition site or sequence is the location on the DNA at which the recombinase binds to the DNA and cleavage and strand exchange occur.

“Recombinase” refers to any enzyme that is capable of performing site-specific recombination of DNA. Recombinase enzymes possess endonuclease and ligase activities. A recombinase may work as a single protein or as a complex of proteins.

“Regulatory element” includes a nucleotide sequence that is involved in conferring upon a host cell the expression of another nucleotide sequence, such as, for example, a sequence of interest. A regulatory element can comprise a promoter that is operably linked to the nucleotide sequence of interest and to a termination signal. Regulatory elements also typically encompass sequences useful for proper translation of the nucleotide sequence of interest.

“Selectable marker” or “selectable marker gene” refers to a nucleotide sequence whose expression in a plant cell gives the cell a selective advantage under particular conditions. The selective advantage possessed by the cell transformed with the selectable marker gene can be an improved ability to grow in the presence of a negative selective agent, such as an antibiotic or an herbicide, for example, as compared to the ability of non-transformed cells. Alternatively, the selective advantage possessed by the transformed cells can be an enhanced capacity, relative to non-transformed cells, to utilize a particular compound (such as a particular carbohydrate source like mannose, for example) as a nutrient, growth factor, or energy source, thereby effecting what is termed “positive selection.” Alternatively, the selective advantage possessed by the transformed cell can be the loss of a previously possessed trait or characteristic, effecting what is termed “negative selection” or “counter selection.” In this last case, the host cell is exposed to or contacted by a compound that is toxic only to cells that have not lost the ability to express a specific trait or characteristic (such as a negative selectable marker gene, for example) that was present in the parent cell, which is typically a transgenic parent cell.

“Site-directed recombination,” as used herein, refers to a recombination of two nucleotide sequences, wherein the recombination occurs between particular recognition sites located on each of the nucleotide sequences.

“Site-specific” means at a particular nucleotide sequence, which can be in a specific location in the genome of a host cell. The nucleotide sequence can be endogenous to the host cell, either in its natural location in the host genome or at some other location in the genome, or it can be a heterologous nucleotide sequence, which has been previously inserted into the genome of the host cell by any of a variety of known methods.

“Stably transformed” refers to a host cell that contains a nucleotide sequence of interest that has been integrated into the host cell genome and is capable of being passed to progeny of that host cell.

“Subcellular organelles” includes intracellular organs of characteristic structure and function. Subcellular organelles include, for example, vacuoles, plastids, mitochondria, the cell nucleus, the endoplasmic reticulum, and the plasma membrane.

“Substantially identical,” as used in the context of two nucleic acid or protein sequences, refers to two or more sequences or subsequences that have at least 60%, preferably 80%, more preferably 90%, even more preferably 95%, and most preferably at least 99% nucleotide or amino acid residue identity, when compared and aligned for maximum correspondence, as measured using one of the following sequence comparison algorithms or by visual inspection. In one embodiment, the substantial identity exists over a region of nucleotide sequences that is at least about 50 residues in length, more preferably over a region of at least about 100 residues, and most preferably the nucleotide sequences are substantially identical over at least about 150 residues. In one embodiment, the nucleotide sequences are substantially identical over the entire length of their coding regions. In another embodiment, the substantial identity exists over a region of protein sequences that is at least about 15 residues in length, more preferably over a region of at least about 30 residues, and most preferably the protein sequences are substantially identical over at least about 50 residues. Furthermore, substantially identical nucleic acid or protein sequences perform substantially the same function.
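The identity thresholds recited above can be applied to a pair of sequences that have already been aligned. The following is a minimal sketch under stated assumptions: the two inputs are equal-length rows of an existing alignment with "-" marking gaps, identity is counted over all aligned columns (conventions for the denominator vary between programs), and the function names `percent_identity` and `identity_tier` are hypothetical. The sketch does not check the minimum region length or the functional requirement that the definition also recites.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two rows of an existing alignment.

    Assumes equal-length rows with '-' as the gap character. Columns where
    both rows have a gap are ignored; a gap opposite a residue counts as a
    non-identical column.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / compared


# Thresholds taken from the definition, from most to least stringent.
TIERS = [(99.0, "at least 99%"), (95.0, "at least 95%"), (90.0, "at least 90%"),
         (80.0, "at least 80%"), (60.0, "at least 60%")]


def identity_tier(pct: float) -> str:
    """Return the most stringent recited threshold that pct satisfies."""
    for threshold, label in TIERS:
        if pct >= threshold:
            return label
    return "below 60%"


if __name__ == "__main__":
    pct = percent_identity("ACGTACGTAC", "ACGTACGTAA")
    print(pct, identity_tier(pct))
```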

For sequence comparison, typically one sequence acts as a reference sequence to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are input into a computer, subsequence coordinates are designated if necessary, and sequence algorithm program parameters are designated. The sequence comparison algorithm then calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based upon the designated program parameters.

Optimal alignment of compared sequences can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2: 482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48: 443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Nat'l. Acad. Sci. USA 85: 2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by visual inspection (see generally, Ausubel et al., infra).
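As an illustration of the local homology approach, the sketch below computes a Smith-Waterman local alignment score by dynamic programming. It is a simplified teaching version, not the implementation in any of the named software packages: it uses a linear gap penalty, returns only the best score rather than the alignment, and the scoring values (match 2, mismatch -1, gap -2) and the function name `smith_waterman_score` are arbitrary assumptions.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score between a and b (Smith-Waterman recurrence).

    Simplified sketch: linear gap penalty, score only (no traceback).
    The default scoring values are illustrative assumptions.
    """
    cols = len(b) + 1
    prev = [0] * cols          # previous row of the dynamic programming matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * cols
        for j in range(1, cols):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap         # gap in b
            left = curr[j - 1] + gap   # gap in a
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, up, left)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGTGG"))
```

Resetting negative cells to zero is what makes the alignment local: the best-scoring region is found regardless of poorly matching flanks. The Needleman & Wunsch global method omits that reset and scores the sequences end to end.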

One example of an algorithm that is suitable for determining percent sequence identity and sequence similarity is the BLAST algorithm, which is described in Altschul et al., J. Mol. Biol. 215: 403-410 (1990). Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., 1990). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when the cumulative alignment score falls off by the quantity X from its maximum achieved value, the cumulative score goes to zero or below due to the accumulation of one or more negative-scoring residue alignments, or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a word length (W) of 11, an expectation (E) of 10, a cutoff of 100, M=5, N=−4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a word length (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89: 10915 (1992)).
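The extension step described above can be sketched as follows. This is a simplified, ungapped, one-directional illustration and not the BLAST implementation itself. It uses the M=5 and N=−4 scores given above; the value chosen for X (`x_drop=20`) and the function name `extend_right` are assumptions made for the example only.

```python
def extend_right(query, subject, q_start, s_start, word_len,
                 match=5, mismatch=-4, x_drop=20):
    """Extend a seed word hit to the right without gaps.

    Returns (best_score, best_length), where best_length counts residues
    from the start of the seed. x_drop stands in for the quantity X; its
    default here is an illustrative assumption.
    """
    # Score the seed word itself.
    score = 0
    for k in range(word_len):
        score += match if query[q_start + k] == subject[s_start + k] else mismatch
    best_score, best_len = score, word_len

    i, j = q_start + word_len, s_start + word_len
    length = word_len
    # Stop when the end of either sequence is reached.
    while i < len(query) and j < len(subject):
        score += match if query[i] == subject[j] else mismatch
        length += 1
        if score > best_score:
            best_score, best_len = score, length
        elif best_score - score >= x_drop or score <= 0:
            # Halt: score fell off by X from its maximum, or dropped to zero or below.
            break
        i += 1
        j += 1
    return best_score, best_len


if __name__ == "__main__":
    q = "ACGTACGTTTTT"
    s = "ACGTACGTAAAA"
    # Seed: the first four residues match in both sequences.
    print(extend_right(q, s, 0, 0, word_len=4))
```

A full implementation would also extend to the left of the seed and, for amino acid sequences, would look up scores in a substitution matrix rather than using fixed match and mismatch values.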

In addition to calculating percent sequence identity, the BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul, Proc. Nat'l. Acad. Sci. USA 90: 5873-5877 (1993)). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a test nucleic acid sequence is considered to be similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid sequence to the reference nucleic acid sequence is less than about 0.1, more preferably less than about 0.01, and most preferably less than about 0.001.

Another indication that two nucleic acid sequences are substantially identical is that the two molecules hybridize to each other under stringent conditions. The phrase “hybridizing specifically to” refers to the binding, duplexing, or hybridizing of a molecule only to a particular nucleotide sequence under stringent conditions when that sequence is present in a complex mixture (e.g., total cellular) of DNA or RNA. “Bind(s) substantially” refers to complementary hybridization between a probe nucleic acid and a target nucleic acid and embraces minor mismatches that can be accommodated by reducing the stringency of the hybridization media to achieve the desired detection of the target nucleic acid sequence.

“Stringent hybridization conditions” and “stringent hybridization wash conditions,” in the context of nucleic acid hybridization experiments such as Southern and Northern hybridizations, are sequence dependent and are different under different environmental parameters. Longer sequences hybridize specifically at higher temperatures. An extensive guide to the hybridization of nucleic acids is found in Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology-Hybridization with Nucleic Acid Probes, part I, chapter 2, “Overview of principles of hybridization and the strategy of nucleic acid probe assays,” Elsevier, N.Y. Generally, highly stringent hybridization and wash conditions are selected to be about 5° C. lower than the thermal melting point (T_(m)) for the specific sequence at a defined ionic strength and pH. Typically, under “stringent conditions” a probe will hybridize to its target subsequence, but to no other sequences.

The “T_(m)” is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. Very stringent conditions are selected to be equal to the T_(m) for a particular probe. An example of stringent hybridization conditions for hybridization of complementary nucleic acids which have more than 100 complementary residues on a filter in a Southern or Northern blot is 50% formamide with 1 mg of heparin at 42° C., with the hybridization being carried out overnight. An example of highly stringent wash conditions is 0.15 M NaCl at 72° C. for about 15 minutes. An example of stringent wash conditions is a 0.2×SSC wash at 65° C. for 15 minutes (see, Sambrook, infra, for a description of SSC buffer). Often, a high stringency wash is preceded by a low stringency wash to remove background probe signal. An exemplary medium stringency wash for a duplex of, e.g., more than 100 nucleotides, is 1×SSC at 45° C. for 15 minutes. An exemplary low stringency wash for a duplex of, e.g., more than 100 nucleotides, is 4-6×SSC at 40° C. for 15 minutes. For short probes (e.g., about 10 to 50 nucleotides), stringent conditions typically involve salt concentrations of less than about 1.0 M Na ion, typically about 0.01 to 1.0 M Na ion concentration (or other salts) at pH 7.0 to 8.3, and the temperature is typically at least about 30° C. Stringent conditions can also be achieved with the addition of destabilizing agents such as formamide. In general, a signal to noise ratio of 2× (or higher) than that observed for an unrelated probe in the particular hybridization assay indicates detection of a specific hybridization. Nucleic acids that do not hybridize to each other under stringent conditions are still substantially identical if the proteins that they encode are substantially identical. This occurs, e.g., when a copy of a nucleic acid is created using the maximum codon degeneracy permitted by the genetic code.

The following are examples of sets of hybridization/wash conditions that may be used to clone homologous nucleotide sequences that are substantially identical to reference nucleotide sequences of the present invention: a homologous nucleotide sequence preferably hybridizes to the reference nucleotide sequence in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. with washing in 2×SSC, 0.1% SDS at 50° C., more desirably in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. with washing in 1×SSC, 0.1% SDS at 50° C., more desirably still in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. with washing in 0.5×SSC, 0.1% SDS at 50° C., preferably in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. with washing in 0.1×SSC, 0.1% SDS at 50° C., more preferably in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄, 1 mM EDTA at 50° C. with washing in 0.1×SSC, 0.1% SDS at 65° C.

A further indication that two nucleic acid sequences or proteins are substantially identical is that the protein encoded by the first nucleic acid is immunologically cross reactive with, or specifically binds to, the protein encoded by the second nucleic acid. Thus, a protein is typically substantially identical to a second protein, for example, where the two proteins differ only by conservative substitutions.

“Target,” “target molecule,” “target DNA,” and “target sequence” are used interchangeably to refer to a nucleotide sequence that is present naturally in the genome or that has been previously introduced into a chromosome of a host cell and can be inherited stably as part of the genome (i.e., “chromosomally integrated”). The target nucleotide sequence may be a sequence of interest, an expression cassette, a promoter, a molecular marker, a marker sequence, a selectable marker, a portion of any of these, or the like. The target sequence can be stably transformed into a plant cell to create a “target line” comprising the target sequence integrated at a particular chromosomal location in the plant genome. A “target construct” or “target vector” contains a target sequence.

A “targeted integration event” or “targeted event” is used interchangeably with an “HR-mediated recombination product” to refer to a recombination product formed by target and donor DNA sequences through homologous recombination (i.e., HR).

“Transformation” is a process for introducing a nucleotide sequence into a host cell or organism. In particular, “transformation” means the stable integration of a DNA molecule into the genome of a cell or an organism of interest.

“Transformed,” “transgenic,” or “recombinant” refers to a cell, tissue, organ, or organism, such as a bacterium or a plant, into which a particular nucleic acid molecule, such as a recombinant vector, has been introduced. The nucleic acid molecule can be stably integrated into the genome of the recipient cell, tissue, organ, or organism and can also be present as an extra-chromosomal or episomal molecule. Such an extra-chromosomal molecule can be auto-replicating. Transformed or transgenic cells, tissues, organs, or organisms are understood to encompass not only the end product of a transformation process but also the progeny thereof, which includes progeny produced from a breeding program employing a transgenic plant as a parent in a cross and exhibiting an altered genotype resulting from the presence of a heterologous nucleic acid molecule. A “non-transformed,” “non-transgenic,” or “non-recombinant” host refers to an organism, e.g., a bacterium or plant, which does not contain the particular nucleic acid molecule.

A “visible marker,” “screenable marker,” or “scoreable marker” refers to a gene or nucleotide sequence whose expression in a transformed cell may not confer an advantage to that cell but can be made visible or otherwise detectable. Examples of visible markers include, but are not limited to, β-glucuronidase (GUS), luciferase (LUC), and fluorescent proteins (such as green fluorescent protein (GFP) or cyan fluorescent protein (CFP), for example).

The present disclosure relates to the targeted integration and stacking of nucleotide sequences within the genome of a host cell using homologous recombination. In one embodiment, a homology sequence shared by a target sequence and a donor sequence comprises at least one intron sequence that lengthens the region of homology and thereby enhances the frequency of homologous recombination between the target and donor sequences. In another embodiment, the homology sequence shared by the target and donor sequences comprises two or more intron sequences that lengthen the region of homology shared between the target and donor. In a further embodiment, a site-specific recombination system can be used to mediate the modification of a chromosomally integrated target sequence to prepare the target site for insertion of a subsequent donor sequence. In yet another embodiment, an endonuclease can be used to enhance recombination frequency and to facilitate introduction of the donor sequence into the host cell's genome at the target site. In a further embodiment, the expression level of at least one RecQ gene present in the genome of the host cell is down-regulated to enhance homologous recombination activity in the host cell. In still another embodiment, the expression level of at least one recombination-related gene present in the genome of the host cell is up-regulated to enhance homologous recombination activity in the host cell.

In one embodiment, a method for targeted nucleotide sequence stacking is provided, the method comprising: (a) providing a host cell comprising a chromosomally integrated target sequence, the target sequence comprising a truncated sequence comprising a homology sequence, the homology sequence comprising at least one intron sequence; (b) introducing into the host cell a donor sequence comprising a sequence of interest and a completion sequence, the completion sequence comprising the homology sequence; and (c) obtaining in the host cell a recombination product comprising the sequence of interest and a functional sequence, the functional sequence comprising the homology sequence (FIG. 1). In another embodiment, the target sequence further comprises a mega-endonuclease recognition sequence, and the method further comprises, prior to obtaining the recombination product, introducing into the host cell a mega-endonuclease or a mega-endonuclease coding sequence, the mega-endonuclease or an expression product of the mega-endonuclease coding sequence being capable of recognizing the mega-endonuclease recognition sequence. Optionally, either of these embodiments may be used in conjunction with a method for down-regulating the expression level of at least one RecQ gene that is present in the genome of the host cell. Optionally, any of these embodiments may be used in conjunction with a method for up-regulating the expression level of at least one recombination-related gene that is present in the genome of the host cell.

In another embodiment, a method for targeted nucleotide sequence stacking is provided, the method comprising: (a) providing a host cell comprising a chromosomally integrated target sequence, the target sequence comprising (i) a first homology sequence and (ii) a truncated sequence comprising a second homology sequence, the second homology sequence comprising at least one intron sequence; (b) introducing into the host cell a donor sequence comprising the first homology sequence, a sequence of interest, and a completion sequence, the completion sequence comprising the second homology sequence; and (c) obtaining in the host cell a recombination product comprising the first homology sequence, the sequence of interest, and a functional sequence, the functional sequence comprising the second homology sequence (FIG. 2). Optionally, this embodiment may be used in conjunction with a method for down-regulating the expression level of at least one RecQ gene that is present in the genome of the host cell. Optionally, either of these embodiments may be used in conjunction with a method for up-regulating the expression level of at least one recombination-related gene that is present in the genome of the host cell.

In another embodiment, a method for targeted nucleotide sequence stacking is provided, the method comprising: (a) providing a host cell comprising a chromosomally integrated target sequence, the target sequence comprising (i) a first homology sequence and (ii) a truncated sequence comprising a second homology sequence, the second homology sequence comprising two or more intron sequences; (b) introducing into the host cell a donor sequence comprising the first homology sequence, a sequence of interest, and a completion sequence, the completion sequence comprising the second homology sequence; and (c) obtaining in the host cell a recombination product comprising the first homology sequence, the sequence of interest, and a functional sequence, the functional sequence comprising the second homology sequence (FIG. 3). In another embodiment, the target sequence further comprises a mega-endonuclease recognition sequence positioned between the first homology sequence and the truncated sequence, and the method further comprises, prior to obtaining the recombination product, introducing into the host cell a mega-endonuclease or a mega-endonuclease coding sequence, the mega-endonuclease or an expression product of the mega-endonuclease coding sequence being capable of recognizing the mega-endonuclease recognition sequence. Optionally, either of these embodiments may be used in conjunction with a method for down-regulating the expression level of at least one RecQ gene that is present in the genome of the host cell. Optionally, any of these embodiments may be used in conjunction with a method for up-regulating the expression level of at least one recombination-related gene that is present in the genome of the host cell.

In another embodiment, a method for targeted nucleotide sequence stacking is provided, the method comprising: (a) providing a host cell comprising a chromosomally integrated target sequence, the target sequence comprising (i) a first homology sequence, (ii) a truncated sequence comprising a second homology sequence, the second homology sequence comprising at least one intron sequence, and (iii) a mega-endonuclease recognition sequence positioned between the first homology sequence and the truncated sequence; (b) introducing into the host cell a donor sequence comprising the first homology sequence, a sequence of interest, and a completion sequence, the completion sequence comprising the second homology sequence; (c) introducing into the host cell a mega-endonuclease or a mega-endonuclease coding sequence, the mega-endonuclease or an expression product of the mega-endonuclease coding sequence being capable of recognizing the mega-endonuclease recognition sequence; and (d) obtaining in the host cell a recombination product comprising the first homology sequence, the sequence of interest, and a functional sequence, the functional sequence comprising the second homology sequence; wherein (b) and (c) can be performed in any order or simultaneously (FIG. 4). Optionally, this embodiment may be used in conjunction with a method for down-regulating the expression level of at least one RecQ gene that is present in the genome of the host cell. Optionally, either of these embodiments may be used in conjunction with a method for up-regulating the expression level of at least one recombination-related gene that is present in the genome of the host cell.

In another embodiment, a method for targeted nucleotide sequence stacking is provided, the method comprising: (a) providing a host cell comprising a chromosomally integrated target sequence, the target sequence comprising (i) a first homology sequence comprising a first sequence of interest and (ii) a truncated sequence comprising a second homology sequence, the second homology sequence comprising at least one intron sequence; (b) introducing into the host cell a donor sequence comprising the first homology sequence, a second sequence of interest, and a completion sequence, the completion sequence comprising the second homology sequence; and (c) obtaining in the host cell a recombination product comprising the first homology sequence, the second sequence of interest, and a functional sequence, the functional sequence comprising the second homology sequence (FIG. 5). In another embodiment, the target sequence further comprises a mega-endonuclease recognition sequence positioned between the first homology sequence and the truncated sequence, and the method further comprises, prior to obtaining the recombination product, introducing into the host cell a mega-endonuclease or a mega-endonuclease coding sequence, the mega-endonuclease or an expression product of the mega-endonuclease coding sequence being capable of recognizing the mega-endonuclease recognition sequence. Optionally, either of these embodiments may be used in conjunction with a method for down-regulating the expression level of at least one RecQ gene that is present in the genome of the host cell. Optionally, any of these embodiments may be used in conjunction with a method for up-regulating the expression level of at least one recombination-related gene that is present in the genome of the host cell.

In accordance with the methods described herein, a target nucleotide sequence is introduced into a host plant cell. In one embodiment, the target sequence is chromosomally integrated into the plant genome by transformation methods described herein or by methods otherwise known in the art. A plant or plant cell transformed with the target sequence may be used to obtain a target cell line or plant line. Such a target cell line or plant line may comprise a single copy of the target sequence integrated into its genome. Once such a target line has been obtained and identified, it may be further characterized. For example, the location of the target sequence can be precisely determined by genetic methods well known in the art or by using molecular markers, such as restriction fragment length polymorphism (RFLP), amplified fragment length polymorphism (AFLP), simple sequence repeat (SSR), and the like. Additionally, host-plant DNA flanking the site of insertion may be sequenced to ensure that no essential gene has been mutated or otherwise disrupted by the insertion of the target sequence. Once a well-characterized target line is obtained, it may be used as a recipient of one or more subsequently introduced nucleotide sequences. Such additional sequences can be comprised in a donor sequence and can be introduced into the target line by any suitable transformation method, including, but not limited to, Agrobacterium-mediated transformation, biolistic bombardment, electroporation, PEG-mediated transformation, and whiskers technology, as described herein or otherwise known in the art.

The target sequence comprises a target homology sequence that is used to effect homologous recombination between the target sequence and a donor sequence that comprises a corresponding donor homology sequence. Absolute limits for the length of the homology sequence or the degree of homology are not fixed. Rather, the desired length of the homology sequence and/or the degree of homology depends upon the frequency and/or efficiency that is sought for a particular application. Generally, the longer the homology sequence and the greater the degree of homology, the greater the recombination frequency between the target and donor sequences.

In one embodiment, the homology sequence contained within each of the target and donor sequences can be any nucleotide sequence that is at least about 200 base pairs in length. The length of the homology sequence can vary and includes unit integral values in the ranges of about 150-300 bp, 200-400 bp, 250-500 bp, 300-600 bp, 350-700 bp, 400-800 bp, 450-900 bp, 500-1000 bp, 600-1250 bp, 700-1500 bp, 800-1750 bp, 900-2000 bp, 1-2.5 kb, 1.5-3 kb, 2-4 kb, 2.5-5 kb, 3-6 kb, 3.5-7 kb, 4-8 kb, and 5-10 kb or more. These exemplary ranges include both endpoints as well as every integer within the range; for example, the range of 1-2.5 kb includes both 1000 bp and 2500 bp as well as every integer between those endpoints (i.e., 1000 bp, 1001 bp, 1002 bp, . . . , 2498 bp, 2499 bp, and 2500 bp).

In another embodiment, the homology sequence includes at least one intron sequence that serves to extend the region of homology shared between the target and donor sequences and thereby enhances targeting efficiency. Any suitable intron sequence can be employed in accordance with the various embodiments of the invention, so long as the intron sequence is capable of being spliced by the host cell from the RNA transcript(s) of a recombination product. As will be appreciated by those of skill in the art, the intron splicing junctions must be properly recognized by the host cell in order to produce an appropriate expression product. Generally, an intron derived from a monocotyledonous plant will tend to be more effectively spliced from an RNA transcript produced by a monocotyledonous host cell. Likewise, an intron derived from a dicotyledonous plant will tend to be more effectively spliced from an RNA transcript produced by a dicotyledonous host cell.

In one embodiment, each intron sequence is at least about 50 base pairs in length. The length of the intron sequence can vary and includes unit integral values in the ranges of about 40-100 bp, 80-150 bp, 120-200 bp, 160-250 bp, 200-300 bp, 240-350 bp, 280-400 bp, 320-450 bp, 360-500 bp, 400-600 bp, 450-700 bp, 500-800 bp, 550-900 bp, 600-1000 bp, 700-1250 bp, 800-1500 bp, 900-1750 bp, 1-2 kb, and 1.5-3 kb or more. These exemplary ranges include both endpoints as well as every integer within the range; for example, the range of 1.5-3 kb includes both 1500 bp and 3000 bp as well as every integer between those endpoints (i.e., 1500 bp, 1501 bp, 1502 bp, . . . , 2998 bp, 2999 bp, and 3000 bp).

In a further embodiment, each homology sequence comprises two or more intron sequences, and each intron sequence is separated from another intron sequence by at least one exon sequence.

In another embodiment, each homology sequence includes at least one recombinase recognition site (as described in greater detail below).

In one embodiment, each of the target and donor sequences comprises two homology sequences. In this embodiment, each of the two homology sequences is independently selected. That is, the first and second homology sequences can be the same or they can be different from each other. In one embodiment, at least one of the first and second homology sequences comprises a sequence of interest.

In one embodiment, the target homology sequence is contained within a truncated, and therefore inactive, nucleotide sequence. The truncated sequence can be, for example, a truncated sequence of interest, a truncated gene, a truncated selectable marker, a truncated visible marker, a truncated negative selectable marker, a truncated promoter sequence, a truncated expression cassette, or the like. In this embodiment, a donor sequence is constructed to include a completion sequence that contains the donor homology sequence. The donor completion sequence completes the truncated nucleotide sequence, in that homologous recombination between the target, which includes the truncated sequence, and the donor, which includes the completion sequence, produces a functional or complete sequence. For example, the truncated sequence can include a 5′ portion (or, alternatively, a 3′ portion) of a sequence of interest, which optionally may be operably linked to a promoter sequence. The corresponding donor completion sequence then includes the 3′ portion (or, alternatively, the 5′ portion) of the sequence of interest and optionally may also include a termination sequence. In this manner, homologous recombination between the target and donor sequences ligates or otherwise operably links the 5′ portion of the sequence of interest with the 3′ portion of the sequence of interest to reconstitute a functional or complete sequence of interest in the recombination product. Only a host cell comprising a desired recombination product has the appropriate expression product (i.e., as derived from a functional sequence of interest).
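The reconstitution of a functional sequence from a truncated sequence and a completion sequence can be pictured with a toy string model. The sketch below is only an illustration of the sequence arrangement, not a model of the recombination mechanism. All sequences and names in it (`PROMOTER`, `FIVE_PRIME`, `INTRON`, `THREE_PRIME`, `TERMINATOR`, `recombine`) are hypothetical placeholders; the shared homology is written in lowercase to stand for an intron.

```python
def recombine(target: str, donor: str, homology: str) -> str:
    """Join target and donor across one shared homology sequence (toy string model).

    Keeps the target up to and including the homology, then appends the
    donor sequence that lies downstream of the same homology.
    """
    t = target.find(homology)
    d = donor.find(homology)
    if t < 0 or d < 0:
        raise ValueError("homology sequence must be present in both target and donor")
    return target[: t + len(homology)] + donor[d + len(homology):]


# Hypothetical placeholder sequences, for illustration only.
PROMOTER = "TATAAA"
FIVE_PRIME = "ATGGCT"      # 5' portion of the sequence of interest
INTRON = "gtaagtctgcag"    # shared homology sequence (intron)
THREE_PRIME = "GGTTGA"     # 3' portion of the sequence of interest
TERMINATOR = "AATAAA"

# Target: promoter + truncated sequence (5' portion ending in the homology).
target = PROMOTER + FIVE_PRIME + INTRON
# Donor: completion sequence (homology followed by the 3' portion) + terminator.
donor = INTRON + THREE_PRIME + TERMINATOR

if __name__ == "__main__":
    product = recombine(target, donor, INTRON)
    print(product)
```

In this model neither the target nor the donor alone carries both the 5′ and 3′ portions, so only the recombination product contains the complete sequence once the intron is spliced out.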

In one embodiment, the target's truncated sequence can be a truncated marker sequence. The truncated marker sequence can include either a 5′ portion of a marker sequence or a 3′ portion of a marker sequence. In one embodiment, the truncated marker sequence includes a 5′ portion of a marker sequence, which can be operably linked to a promoter sequence. The corresponding donor completion sequence includes the 3′ portion of the marker sequence and can also include a termination sequence. In this manner, homologous recombination between the target and donor sequences ligates or otherwise operably links the 5′ portion of the marker sequence with the 3′ portion of the marker sequence to reconstitute a functional marker sequence in the recombination product. In another embodiment, the truncated marker sequence includes a 3′ portion of a marker sequence and can also include a termination sequence. The corresponding donor completion sequence includes the 5′ portion of the marker sequence, which can be operably linked to a promoter sequence. Homologous recombination between the target and donor sequences ligates or otherwise operably links the 5′ portion of the marker sequence with the 3′ portion of the marker sequence to reconstitute a functional marker sequence in the recombination product.

In one embodiment, the target sequence comprises a mega-endonuclease recognition sequence. Exemplary mega-endonuclease recognition sequences include those sequences that are recognized and cleaved by various endonucleases, such as, for example, I-SceI (18 bp recognition sequence, i.e., 5′-TAGGGATAA↓CAGGGTAAT-3′, where the arrow indicates the cleavage site), I-CeuI (26 bp recognition sequence, i.e., 5′-TAACTATAACGGTCCTAAGGTAGCGA-3′), I-PpoI (15 bp recognition sequence, i.e., 5′-CTCTCTTAAGGTAGC-3′), PI-PspI (30 bp recognition sequence, i.e., 5′-TGGCAAACAGCTATTAT↓GGGTATTATGGGT-3′), PI-SceI (39 bp recognition sequence, i.e., 5′-ATC TAT GTC GGG TGC GGA GAA AGA GGT AAT GAA ATG GCA-3′), and HO (20 bp recognition sequence, i.e., 5′-CAG CTT TCC GCA ACA GTA TA-3′). Other mega-endonuclease recognition sequences may also be used, such as any sequence recognized by I-CreI, I-DmoI, I-SceII, I-TevI, I-TevII, PI-PfuI, or any sequence recognized by other mega-endonucleases that are known in the art. See, e.g., Belfort and Roberts, p. 3382, Table 3.
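As a quick consistency check on the recognition sequences listed above, and as an illustration of how such a site might be located in a construct, the following sketch stores each sequence with its stated length and searches both strands for exact matches. The helper names (`site_sequence`, `reverse_complement`, `find_sites`) and the example host sequence are hypothetical; cleavage positions are not modeled.

```python
# Recognition sequences as listed in the text (5'->3'), paired with the stated length in bp.
# Spaces are kept as printed and stripped on lookup.
RECOGNITION_SITES = {
    "I-SceI": ("TAGGGATAA CAGGGTAAT", 18),
    "I-CeuI": ("TAACTATAACGGTCCTAAGGTAGCGA", 26),
    "I-PpoI": ("CTCTCTTAAGGTAGC", 15),
    "PI-PspI": ("TGGCAAACAGCTATTAT GGGTATTATGGGT", 30),
    "PI-SceI": ("ATC TAT GTC GGG TGC GGA GAA AGA GGT AAT GAA ATG GCA", 39),
    "HO": ("CAG CTT TCC GCA ACA GTA TA", 20),
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def site_sequence(name: str) -> str:
    """Return the recognition sequence for `name` with spacing removed."""
    return RECOGNITION_SITES[name][0].replace(" ", "")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_sites(seq: str, name: str):
    """Exact matches of a recognition sequence on either strand.

    Returns sorted (start, strand) pairs; start is a 0-based index into `seq`.
    A palindromic site would be reported once per strand.
    """
    seq = seq.upper()
    site = site_sequence(name)
    hits = []
    for strand, pattern in (("+", site), ("-", reverse_complement(site))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    for name, (_, stated_bp) in RECOGNITION_SITES.items():
        print(name, len(site_sequence(name)), "bp; stated", stated_bp, "bp")
    # Hypothetical construct carrying one I-SceI site between short flanks.
    host = "GGCC" + site_sequence("I-SceI") + "AATT"
    print(find_sites(host, "I-SceI"))
```

A search of this kind can also be used to confirm that a chosen recognition sequence does not already occur elsewhere in a target or donor construct.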

As will be appreciated by those of skill in the art, mega-endonucleases do not have stringent recognition sequences. The above recognition sequences are but single examples of the recognition sequences that may be used with each of the indicated mega-endonucleases. Other recognition sequences, such as, for example, degenerate variations of the sequences indicated above, may also be used, including recognition sequences having single or multiple base changes. See, e.g., Argast et al. 1998 J. Mol. Biol. 280: 345-353; and Gimble and Wang 1996 J. Mol. Biol. 256: 163-180.
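One way to look for degenerate variants of a recognition sequence is to scan for windows that differ from the canonical sequence by at most a fixed number of bases. The sketch below does this for a single strand. It is a simplification: it assumes every position tolerates a change equally, which is not generally true for a given mega-endonuclease, so any hit would still need to be checked against published cleavage data. The function name `find_degenerate_sites` and the `max_mismatches` parameter are hypothetical.

```python
I_SCEI_SITE = "TAGGGATAACAGGGTAAT"  # canonical I-SceI recognition sequence from the text


def find_degenerate_sites(seq: str, site: str, max_mismatches: int = 1):
    """Windows of `seq` that differ from `site` at no more than `max_mismatches` positions.

    Returns (start, mismatches) pairs with 0-based starts, forward strand only.
    Assumption: all positions are treated as equally tolerant of a base change.
    """
    seq = seq.upper()
    site = site.upper()
    n = len(site)
    hits = []
    for start in range(len(seq) - n + 1):
        window = seq[start:start + n]
        mismatches = sum(a != b for a, b in zip(window, site))
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


if __name__ == "__main__":
    # Hypothetical sequence carrying an I-SceI site with its last base changed (T -> A).
    variant = "CC" + "TAGGGATAACAGGGTAAA" + "GG"
    print(find_degenerate_sites(variant, I_SCEI_SITE, max_mismatches=1))
    print(find_degenerate_sites(variant, I_SCEI_SITE, max_mismatches=0))
```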

A mega-endonuclease or a sequence encoding a mega-endonuclease can be introduced into the host plant cell prior to, after, or simultaneously with the introduction of the donor sequence. In one embodiment, a mega-endonuclease is introduced into the host cell as a nucleic acid molecule (DNA and/or RNA) that comprises a coding sequence for the mega-endonuclease. The mega-endonuclease can be introduced as an expression cassette comprising the coding sequence operatively linked to a plant expressible promoter and an appropriate termination sequence. As used herein, “plant expressible” means that the promoter is operable within a plant cell and is therefore capable of driving expression of a nucleotide sequence to which the promoter is operably linked within the plant cell. The promoter may be selected such that expression of the mega-endonuclease can be spatially or temporally regulated in any desired manner. For example, a promoter can be selected such that expression of the mega-endonuclease is constitutive, developmentally regulated, tissue specific, tissue preferred, cell specific, specific to a particular cellular compartment (i.e., organellar-specific), or the like. Additionally, the promoter can be chosen so that expression of the mega-endonuclease can be chemically induced in a plant, resulting in expression of the mega-endonuclease only in response to treatment of the plant cell or tissue with a chemical ligand. By combining promoter elements that confer specific expression with those conferring chemically induced expression, the mega-endonuclease can be expressed or activated within specific cells or tissues of the plant in response to a chemical application. Any of a variety of plant expressible promoters can be used to drive expression of the mega-endonuclease. Several of such promoters are described herein, and other such promoters are known in the art.

In another embodiment, the mega-endonuclease is introduced into the plant cell by being stably transformed into the genome of the plant cell. For example, the mega-endonuclease can be comprised in an expression cassette comprising the coding sequence of the mega-endonuclease operatively linked to a promoter capable of expression in plant tissues and cells. Suitable methods for stably transforming plant cells are known in the art and are described herein. In one embodiment, a plant cell that is stably transformed with the mega-endonuclease is also stably transformed with a target sequence. In another embodiment, a plant cell that is stably transformed with the mega-endonuclease is also stably transformed with a donor sequence.

As will be appreciated by one of skill in the art, a whole plant can be regenerated from a plant cell or a group of plant cells that has been stably transformed with a selected nucleotide sequence. This regenerated whole plant is then also referred to as being transformed with the selected nucleotide sequence. Thus, for example, in accordance with the methods disclosed herein, a first plant can be stably transformed with one or more expression cassettes comprising a mega-endonuclease and a donor sequence, and this first plant then can be crossed with a second plant that is stably transformed with a target sequence. Accordingly, expression of the mega-endonuclease in an F1 plant or seed can facilitate recombination between the target and donor sequences such that the HR-mediated recombination product is formed in the F1 plant or seed. The nucleotide sequence encoding the mega-endonuclease and the unrecombined portion(s) of the donor sequence can then be segregated from a nucleotide sequence comprising the recombination product sequence(s) through breeding.

In another embodiment, the mega-endonuclease can be introduced into a plant cell such that the plant cell transiently expresses the mega-endonuclease. For example, the mega-endonuclease coding sequence can be introduced into a plant cell through any known means for plant transformation, such as, for example, Agrobacterium or microprojectile bombardment. Frequently, the introduced nucleotide sequence is not integrated into the genome but can be transcribed nonetheless into mRNA.

In another embodiment, the coding sequence of the mega-endonuclease is supplied to the host cell in the form of messenger RNA (mRNA). In this manner, the mega-endonuclease is provided to the host cell only transiently. The coding sequence for the mega-endonuclease can be inserted into a vector for in vitro transcription of the RNA using methods described in Lebel et al. 1995 Theor. Appl. Genet. 91:899-906 and U.S. Pat. No. 6,051,409. The RNA then can be transformed into a host cell, such as a cell from a donor line or a target line, for example. In one embodiment, the RNA can be co-transformed into a host cell with a donor sequence. In an exemplary embodiment, the RNA can be transferred to a host cell using microprojectile bombardment, as described in U.S. Pat. No. 6,051,409. In another embodiment, the RNA can be introduced into protoplasts of a host cell by PEG-mediated transformation (see, e.g., Lebel et al. 1995 Theor. Appl. Genet. 91:899-906) or by electroporation. In another embodiment, other transformation techniques, such as microinjection of the RNA, can be used to introduce the RNA into the host cell.

In a further embodiment, an active mega-endonuclease can be introduced into a host cell as a protein, such as a purified protein, for example. The mega-endonuclease protein can be introduced into the cell by any suitable method known in the art, such as, for example, microinjection or electroporation. In another embodiment, the mega-endonuclease can be introduced into the host cell by microinjection together with a donor DNA sequence (see, e.g., Neuhaus et al. 1993 Cell 73:937-952). In another embodiment, the mega-endonuclease protein is introduced into the host cell through infection with Agrobacterium comprising a VirE2 or VirF fusion protein (see, e.g., Vergunst et al. 2000 Science 290:979-82).

In one embodiment, the coding sequence of the mega-endonuclease can be optimized for expression in a particular plant host. It is known in the art that the expression of heterologous proteins in plants can be enhanced by optimizing the coding sequences of the proteins according to the codon preference of the host plant. The preferred codon usage in plants differs from the preferred codon usage in certain microorganisms. A comparison of the codon usage within a cloned microbial ORF (open reading frame) to the codon usage in plant genes (and, in particular, genes from the selected host plant) enables an identification of the codons within the ORF that can be changed in an effort to optimize the coding sequence for expression in the host plant.
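The comparison described above can be illustrated with a minimal Python sketch. It is not a method disclosed herein: the codon table covers only a few amino acids of the standard genetic code, and the host-preferred codons in HOST_PREFERRED are hypothetical placeholders that would in practice come from a codon usage table compiled for the selected host plant. The function names are likewise illustrative.

```python
# Subset of the standard genetic code, sufficient for the toy ORF below.
CODON_TO_AA = {
    "ATG": "M",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
    "CTT": "L", "CTC": "L", "CTA": "L", "CTG": "L", "TTA": "L", "TTG": "L",
    "TAA": "*", "TAG": "*", "TGA": "*",
}

# Hypothetical preferred codon per amino acid (assumption, not host data).
HOST_PREFERRED = {"M": "ATG", "A": "GCC", "K": "AAG", "L": "CTG", "*": "TGA"}


def codons(orf):
    """Split an ORF into consecutive three-base codons."""
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return [orf[i:i + 3] for i in range(0, len(orf), 3)]


def translate(orf):
    """Translate an ORF with the subset table above."""
    return "".join(CODON_TO_AA[c] for c in codons(orf.upper()))


def optimize(orf):
    """Swap each codon for the host-preferred synonymous codon.

    Returns the rewritten ORF and the indices of the codons that changed.
    """
    rewritten, changed = [], []
    for index, codon in enumerate(codons(orf.upper())):
        preferred = HOST_PREFERRED[CODON_TO_AA[codon]]
        if preferred != codon:
            changed.append(index)
        rewritten.append(preferred)
    return "".join(rewritten), changed


microbial_orf = "ATGGCTAAATTATAA"
plant_orf, changed_codons = optimize(microbial_orf)
```

Because only synonymous codons are exchanged, translate(microbial_orf) and translate(plant_orf) give the same protein sequence; changed_codons lists the positions that were identified as candidates for change.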

In one embodiment, the donor sequence comprises at least one sequence of interest. The sequence of interest may be included in an expression cassette, and expression of the sequence of interest may be controlled by any of the promoters described herein or by any other plant expressible promoter known in the art. The promoter that controls or drives expression of the sequence of interest can be included in the expression cassette that comprises the sequence of interest, or the promoter can be otherwise operably linked to the sequence of interest. Exemplary sequences of interest include, but are not limited to, sequences encoding traits related to any of the following desirable characteristics: waxy starch; herbicide tolerance; resistance to bacterial, fungal, or viral disease; insect resistance; abiotic stress resistance; enhanced nutritional quality; improved performance in an industrial process; altered reproductive capability, such as male sterility or male fertility; yield stability; yield enhancement; and the production of commercially valuable enzymes or metabolites in plants.

In another embodiment, the donor sequence may also include a donor marker sequence, such as a selectable or visible marker gene, for example. The donor marker sequence can be any marker sequence described herein or otherwise known in the art but is typically different from any marker sequence associated with the target homology sequence. In this context, “associated with the target homology sequence” means that the marker sequence, or a truncated form of the marker sequence, is part of the target sequence and includes the target homology sequence, such that the target sequence would be capable of expressing the marker sequence upon recombination with a donor that included the corresponding completion sequence, as described above. In such a case, the donor marker sequence can be selected such that the donor marker sequence is different from the marker sequence associated with the target homology sequence, and recombination of the target and donor results in a recombination product that includes two different marker sequences. As described herein, the donor marker sequence can be operably linked to a suitable promoter and/or a suitable termination sequence.

In another embodiment, the donor sequence can be stably integrated into a plant genome. A plant or plant cell transformed with the donor sequence can be obtained by any suitable transformation method, as described herein or by methods otherwise known in the art, and is used to form a donor cell line or plant line. Such a donor cell line or plant line may include a single copy of the donor sequence integrated into its genome. Once such a donor line has been obtained and identified, it may be further characterized, as described above with respect to the target line.

In one embodiment, a target line can be crossed with a donor line by methods of sexual reproduction known in the art, such as, for example, by pollinating the target line with pollen of the donor line and obtaining seed comprising both the target and donor sequences. An HR-mediated recombination product can result from an exchange of nucleotide sequences between a target sequence locus and a donor sequence locus.

In accordance with another aspect of the methods disclosed herein, a site-specific recombinase can be used to excise a portion of a target sequence that has been introduced into a host cell prior to introducing a donor sequence into that host cell. Exemplary site-specific recombinases (and corresponding recognition sites) include, but are not limited to, FLP (FRT), Cre (Lox), R (RS), Gin (gix), β (six), an integrase from any of bacteriophage-λ, HK022, φC31, or R4 (and their corresponding attB/attP or attL/attR sites), as well as any of several other recombinases that are known in the art (see, e.g., Nunes-Duby et al. 1998 Nucleic Acids Research 26:391-406; Smith and Thorpe 2002 Molecular Microbiology 44:299-307).

In accordance with another aspect of the methods disclosed herein, recombinase recognition sites and a corresponding site-specific recombinase can be used to modify an HR-recombination product in preparation for a successive round of targeted sequence integration and stacking.

In one embodiment, a method for targeted nucleotide sequence stacking is provided, the method comprising: (a) providing a host cell comprising a chromosomally integrated target sequence, the target sequence comprising a truncated sequence comprising a homology sequence, the homology sequence comprising at least one intron sequence; (b) introducing into the host cell a donor sequence comprising (i) a sequence of interest, (ii) a completion sequence comprising a first recombinase recognition site and the homology sequence, and (iii) a second recombinase recognition site positioned between the sequence of interest and the completion sequence; and (c) obtaining in the host cell a recombination product comprising the sequence of interest, the second recombinase recognition site, and a functional sequence, the functional sequence comprising the first recombinase recognition site and the homology sequence; wherein the first and second recombinase recognition sites can be the same or different (FIG. 6). In another embodiment, the target sequence further comprises a first mega-endonuclease recognition sequence, the donor sequence further comprises a second mega-endonuclease recognition sequence positioned between the sequence of interest and the completion sequence, and the method further comprises, prior to obtaining the recombination product, introducing into the host cell a mega-endonuclease or a mega-endonuclease coding sequence, the mega-endonuclease or an expression product of the mega-endonuclease coding sequence being capable of recognizing the first mega-endonuclease recognition sequence. Optionally, either of these embodiments may be used in conjunction with a method for down-regulating the expression level of at least one RecQ gene that is present in the genome of the host cell. Optionally, any of these embodiments may be used in conjunction with a method for up-regulating the expression level of at least one recombination-related gene that is present in the genome of the host cell.

In another embodiment, a method for targeted nucleotide sequence stacking is provided, the method comprising: (a) providing a host cell comprising a chromosomally integrated target sequence, the target sequence comprising (i) a first homology sequence and (ii) a truncated sequence comprising a second homology sequence, the second homology sequence comprising at least one intron sequence; (b) introducing into the host cell a donor sequence comprising (i) the first homology sequence, (ii) a sequence of interest, (iii) a completion sequence comprising a first recombinase recognition site and the second homology sequence, and (iv) a second recombinase recognition site positioned between the sequence of interest and the completion sequence; and (c) obtaining in the host cell a recombination product comprising the first homology sequence, the sequence of interest, the second recombinase recognition site, and a functional sequence, the functional sequence comprising the first recombinase recognition site and the second homology sequence; wherein the first and second recombinase recognition sites can be the same or different (FIG. 7A). In another embodiment, the target sequence further comprises a first mega-endonuclease recognition sequence positioned between the first homology sequence and the truncated sequence, the donor sequence further comprises a second mega-endonuclease recognition sequence positioned between the sequence of interest and the completion sequence, and the method further comprises, prior to obtaining the recombination product, introducing into the host cell a mega-endonuclease or a mega-endonuclease coding sequence, the mega-endonuclease or an expression product of the mega-endonuclease coding sequence being capable of recognizing the first mega-endonuclease recognition sequence. Optionally, either of these embodiments may be used in conjunction with a method for down-regulating the expression level of at least one RecQ gene that is present in the genome of the host cell. Optionally, any of these embodiments may be used in conjunction with a method for up-regulating the expression level of at least one recombination-related gene that is present in the genome of the host cell.
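The arrangement of elements in a two-homology embodiment of this kind can be sketched, purely for illustration, by treating each construct as an ordered list of labeled elements. The labels (H1, H2, SOI, RS_1, RS_2, promoter, marker_5prime, marker_3prime, genomic_left, genomic_right) and the function hr_product are hypothetical names chosen for this sketch; the model assumes that the completion sequence supplies the upstream part of a marker and that the truncated sequence supplies the downstream part, and it ignores the molecular mechanism of recombination.

```python
def hr_product(target, donor, left_arm, right_arm):
    """Model the HR-mediated product of a target and a donor.

    Everything in the target between the two homology arms is replaced
    by the donor material lying between the same two arms.
    """
    t_left, t_right = target.index(left_arm), target.index(right_arm)
    d_left, d_right = donor.index(left_arm), donor.index(right_arm)
    if t_left >= t_right or d_left >= d_right:
        raise ValueError("left arm must precede right arm in both constructs")
    return target[:t_left] + donor[d_left:d_right + 1] + target[t_right + 1:]


# Target: first homology sequence, then a truncated marker (H2 + 3' part).
target = ["genomic_left", "H1", "H2", "marker_3prime", "genomic_right"]

# Donor: H1, sequence of interest, second recombinase site, and a
# completion sequence (promoter, 5' part, first recombinase site, H2).
donor = ["H1", "SOI", "RS_2", "promoter", "marker_5prime", "RS_1", "H2"]

product = hr_product(target, donor, "H1", "H2")
```

In the resulting list, the promoter, marker_5prime, RS_1, H2, and marker_3prime elements sit together, which corresponds to the functional sequence comprising the first recombinase recognition site and the second homology sequence, while SOI and RS_2 lie between H1 and that functional sequence.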

In another embodiment, a method for targeted nucleotide sequence stacking is provided, the method comprising: (a) providing a host cell comprising a chromosomally integrated target sequence, the target sequence comprising (i) a first homology sequence and (ii) a truncated sequence comprising a second homology sequence, the second homology sequence comprising two or more intron sequences; (b) introducing into the host cell a donor sequence comprising (i) the first homology sequence, (ii) a sequence of interest, (iii) a completion sequence comprising a first recombinase recognition site and the second homology sequence, and (iv) a second recombinase recognition site positioned between the sequence of interest and the completion sequence; and (c) obtaining in the host cell a recombination product comprising the first homology sequence, the sequence of interest, the second recombinase recognition site, and a functional sequence, the functional sequence comprising the first recombinase recognition site and the second homology sequence; wherein the first and second recombinase recognition sites can be the same or different (FIG. 7B). In another embodiment, the target sequence further comprises a first mega-endonuclease recognition sequence positioned between the first homology sequence and the truncated sequence, the donor sequence further comprises a second mega-endonuclease recognition sequence positioned between the sequence of interest and the completion sequence, and the method further comprises, prior to obtaining the recombination product, introducing into the host cell a mega-endonuclease or a mega-endonuclease coding sequence, the mega-endonuclease or an expression product of the mega-endonuclease coding sequence being capable of recognizing the first mega-endonuclease recognition sequence. Optionally, either of these embodiments may be used in conjunction with a method for down-regulating the expression level of at least one RecQ gene that is present in the genome of the host cell. Optionally, any of these embodiments may be used in conjunction with a method for up-regulating the expression level of at least one recombination-related gene that is present in the genome of the host cell.

In another embodiment, a method for targeted nucleotide sequence stacking is provided, the method comprising: (a) providing a host cell comprising a chromosomally integrated target sequence, the target sequence comprising (i) a first homology sequence comprising a first sequence of interest and (ii) a truncated sequence comprising a second homology sequence, the second homology sequence comprising at least one intron sequence; (b) introducing into the host cell a donor sequence comprising (i) the first homology sequence, (ii) a second sequence of interest, (iii) a completion sequence comprising a first recombinase recognition site and the second homology sequence, and (iv) a second recombinase recognition site positioned between the second sequence of interest and the completion sequence; and (c) obtaining in the host cell a recombination product comprising the first homology sequence, the second sequence of interest, the second recombinase recognition site, and a functional sequence, the functional sequence comprising the first recombinase recognition site and the second homology sequence; wherein the first and second recombinase recognition sites can be the same or different (FIG. 8). In another embodiment, the target sequence further comprises a first mega-endonuclease recognition sequence positioned between the first homology sequence and the truncated sequence, the donor sequence further comprises a second mega-endonuclease recognition sequence positioned between the second sequence of interest and the completion sequence, and the method further comprises, prior to obtaining the recombination product, introducing into the host cell a mega-endonuclease or a mega-endonuclease coding sequence, the mega-endonuclease or an expression product of the mega-endonuclease coding sequence being capable of recognizing the first mega-endonuclease recognition sequence. Optionally, either of these embodiments may be used in conjunction with a method for down-regulating the expression level of at least one RecQ gene that is present in the genome of the host cell. Optionally, any of these embodiments may be used in conjunction with a method for up-regulating the expression level of at least one recombination-related gene that is present in the genome of the host cell.

In one embodiment, a method for targeted nucleotide sequence stacking is provided, the method comprising: (a) providing a host cell comprising a chromosomally integrated target sequence, the target sequence comprising (i) a first homology sequence comprising a first sequence of interest, (ii) a truncated sequence comprising a second homology sequence, the second homology sequence comprising at least one intron sequence, and (iii) a first mega-endonuclease recognition sequence positioned between the first homology sequence and the truncated sequence; (b) introducing into the host cell a donor sequence comprising (i) the first homology sequence, (ii) a second sequence of interest, (iii) a completion sequence comprising a first recombinase recognition site and the second homology sequence, (iv) a second mega-endonuclease recognition sequence positioned between the second sequence of interest and the completion sequence, and (v) a second recombinase recognition site positioned between the second mega-endonuclease recognition sequence and the completion sequence; (c) introducing into the host cell a mega-endonuclease or a mega-endonuclease coding sequence, the mega-endonuclease or an expression product of the mega-endonuclease coding sequence being capable of recognizing the first mega-endonuclease recognition sequence; (d) obtaining in the host cell a recombination product comprising the first homology sequence, the second sequence of interest, the second recombinase recognition site, and a functional sequence, the functional sequence comprising the first recombinase recognition site and the second homology sequence; (e) introducing into the host cell a recombinase or a recombinase coding sequence, the recombinase or an expression product of the recombinase coding sequence being capable of recognizing the first and second recombinase recognition sites; and (f) obtaining in the host cell a recombination product comprising the first homology sequence, the second sequence of interest, the second mega-endonuclease recognition sequence, and a truncated sequence comprising a third recombinase recognition site and the second homology sequence; wherein the first and second recombinase recognition sites can be the same or different; wherein the second and third recombinase recognition sites can be the same or different; and wherein (b) and (c) may be performed in any order or simultaneously (FIG. 9A). Optionally, this embodiment may be used in conjunction with a method for down-regulating the expression level of at least one RecQ gene that is present in the genome of the host cell. Optionally, this embodiment may be used in conjunction with a method for up-regulating the expression level of at least one recombination-related gene that is present in the genome of the host cell.

In another embodiment, a method for targeted nucleotide sequence stacking is provided, the method comprising: (a) providing a host cell comprising a chromosomally integrated target sequence, the target sequence comprising (i) a first homology sequence comprising a first sequence of interest, (ii) a truncated sequence comprising a second homology sequence, the second homology sequence comprising at least one intron sequence, and (iii) a first mega-endonuclease recognition sequence positioned between the first homology sequence and the truncated sequence; (b) introducing into the host cell a donor sequence comprising (i) the first homology sequence, (ii) a second sequence of interest, (iii) a first completion sequence comprising a first recombinase recognition site and the second homology sequence, (iv) a second mega-endonuclease recognition sequence positioned between the second sequence of interest and the completion sequence, and (v) a second recombinase recognition site positioned between the second mega-endonuclease recognition sequence and the completion sequence; (c) introducing into the host cell a mega-endonuclease or a mega-endonuclease coding sequence, the mega-endonuclease or an expression product of the mega-endonuclease coding sequence being capable of recognizing the first mega-endonuclease recognition sequence; (d) obtaining in the host cell a recombination product comprising the first homology sequence, the second sequence of interest, the second mega-endonuclease recognition sequence, the second recombinase recognition site, and a functional sequence, the functional sequence comprising the first recombinase recognition site and the second homology sequence; (e) introducing into the host cell a recombinase or a recombinase coding sequence, the recombinase or an expression product of the recombinase coding sequence being capable of recognizing the first and second recombinase recognition sites; (f) obtaining in the host cell a recombination product comprising the first homology sequence, the second sequence of interest, the second mega-endonuclease recognition sequence, and a truncated sequence comprising a third recombinase recognition site and the second homology sequence; (g) introducing into the host cell a second donor sequence comprising (i) a third homology sequence comprising the second sequence of interest, (ii) a third sequence of interest, (iii) a second completion sequence comprising a fourth recombinase recognition site and the second homology sequence, (iv) a third mega-endonuclease recognition sequence positioned between the third sequence of interest and the second completion sequence, and (v) a fifth recombinase recognition site positioned between the third mega-endonuclease recognition sequence and the second completion sequence; (h) introducing into the host cell a second mega-endonuclease or a second mega-endonuclease coding sequence, the second mega-endonuclease or an expression product of the second mega-endonuclease coding sequence being capable of recognizing the second mega-endonuclease recognition sequence; and (i) obtaining in the host cell a recombination product comprising the first sequence of interest, the third homology sequence comprising the second sequence of interest, the third sequence of interest, the third mega-endonuclease recognition sequence, the fifth recombinase recognition site, and a functional sequence comprising the fourth recombinase recognition site and the second homology sequence; wherein the first and third mega-endonuclease recognition sequences may be the same or different; wherein the first and second recombinase recognition sites can be the same or different; wherein the second and third recombinase recognition sites can be the same or different; wherein the third and fifth recombinase recognition sites can be the same or different; wherein the fourth and fifth recombinase recognition sites can be the same or different; wherein (b) and (c) may be performed in any order or simultaneously; and wherein (g) and (h) may be performed in any order or simultaneously. As will be readily appreciated by one skilled in the art, steps (e) through (h) may be repeated as desired to obtain a host cell comprising multiple sequences of interest (FIGS. 9A and 9B). Optionally, this embodiment may be used in conjunction with a method for down-regulating the expression level of at least one RecQ gene that is present in the genome of the host cell. Optionally, this embodiment may be used in conjunction with a method for up-regulating the expression level of at least one recombination-related gene that is present in the genome of the host cell.

In accordance with another aspect of the methods disclosed herein, a target sequence that has been chromosomally integrated into the host cell genome can include a first recombinase recognition site and a functional sequence, such as a marker sequence, for example. The functional sequence can comprise a target homology sequence that includes a second recombinase recognition site. The portion of the target sequence that is positioned between the first and second recombinase recognition sites is an excisable sequence, which can be removed by a suitable recombinase that is introduced into the cell and is capable of recognizing the first and second recombinase recognition sites. Removal of the excisable sequence by the recombinase transforms the functional sequence into a truncated sequence.
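The conversion of a functional sequence into a truncated sequence can be pictured with a short Python sketch that treats the target as an ordered list of labeled elements. This is only a schematic model under stated assumptions: the element labels, the function name excise, and the label RS_product for the site remaining after excision are hypothetical, and the second recognition site is placed at the upstream end of the homology sequence for simplicity.

```python
def excise(construct, upstream_site, internal_site, product_site):
    """Remove the span from upstream_site through internal_site.

    A single site (product_site) is left where the excised span was,
    mirroring the site that remains after recombinase-mediated excision.
    """
    start = construct.index(upstream_site)
    end = construct.index(internal_site)
    if start >= end:
        raise ValueError("upstream site must precede the internal site")
    return construct[:start] + [product_site] + construct[end + 1:]


# First site upstream of a functional marker; second site inside the
# marker's homology sequence (modeled here as RS_second + intron).
target = [
    "RS_first",
    "promoter", "marker_5prime",   # lost on excision
    "RS_second", "intron",         # homology sequence
    "marker_3prime",
]

truncated = excise(target, "RS_first", "RS_second", "RS_product")
```

After the call, truncated holds RS_product, intron, and marker_3prime: the promoter and upstream marker portion are gone, so the marker can no longer be expressed until a donor supplies a matching completion sequence.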

In one embodiment, a method for preparing a target sequence for targeted integration and stacking is provided, the method comprising: (a) providing a host cell comprising a chromosomally integrated target sequence, the target sequence comprising (i) a functional sequence comprising a homology sequence, the homology sequence comprising a first recombinase recognition site and at least one intron sequence and (ii) a second recombinase recognition site positioned upstream (i.e., to the 5′ side) of the functional sequence; (b) introducing into the host cell a recombinase or a recombinase coding sequence, the recombinase or an expression product of the recombinase coding sequence being capable of recognizing the first and second recombinase recognition sites; and (c) obtaining in the host cell a recombination product comprising a truncated sequence comprising a third recombinase recognition site and the homology sequence; wherein the first and second recombinase recognition sites can be the same or different; and wherein the second and third recombinase recognition sites can be the same or different (FIG. 10). Optionally, the target sequence may further comprise a mega-endonuclease recognition sequence positioned upstream of the second recombinase recognition site.

In another embodiment, a method for targeted nucleotide sequence stacking is provided, the method comprising: (a) providing a host cell comprising a chromosomally integrated target sequence, the target sequence comprising (i) a first homology sequence, (ii) a functional sequence comprising a second homology sequence, the second homology sequence comprising a first recombinase recognition site and at least one intron sequence, and (iii) a second recombinase recognition site positioned between the first homology sequence and the functional sequence; (b) introducing into the host cell a recombinase or a recombinase coding sequence, the recombinase or an expression product of the recombinase coding sequence being capable of recognizing the first and second recombinase recognition sites; (c) obtaining in the host cell a recombination product comprising the first homology sequence and a truncated sequence comprising a third recombinase recognition site and the second homology sequence; (d) introducing into the host cell a donor sequence comprising the first homology sequence, a sequence of interest, and a completion sequence, the completion sequence comprising the second homology sequence; and (e) obtaining in the host cell a recombination product comprising the first homology sequence, the sequence of interest, and the functional sequence comprising the second homology sequence; wherein the first and second recombinase recognition sites can be the same or different; and wherein the second and third recombinase recognition sites can be the same or different (FIG. 11). In another embodiment, the target sequence further comprises a mega-endonuclease recognition sequence positioned between the first homology sequence and the functional sequence, and the method further comprises, any time after step (c) and prior to step (e), introducing into the host cell a mega-endonuclease or a mega-endonuclease coding sequence, the mega-endonuclease or an expression product of the mega-endonuclease coding sequence being capable of recognizing the mega-endonuclease recognition sequence. Optionally, either of these embodiments may be used in conjunction with a method for down-regulating the expression level of at least one RecQ gene that is present in the genome of the host cell. Optionally, any of these embodiments may be used in conjunction with a method for up-regulating the expression level of at least one recombination-related gene that is present in the genome of the host cell.

In another embodiment, a method for targeted nucleotide sequence stacking is provided, the method comprising: (a) providing a host cell comprising a chromosomally integrated target sequence, the target sequence comprising (i) a first homology sequence, (ii) a functional sequence comprising a second homology sequence, the second homology sequence comprising a first recombinase recognition site and at least one intron sequence, and (iii) a second recombinase recognition site positioned between the first homology sequence and the functional sequence; (b) introducing into the host cell a recombinase or a recombinase coding sequence, the recombinase or an expression product of the recombinase coding sequence being capable of recognizing the first and second recombinase recognition sites; (c) obtaining in the host cell a recombination product comprising the first homology sequence and a truncated sequence comprising a third recombinase recognition site and the second homology sequence; (d) introducing into the host cell a donor sequence comprising (i) the first homology sequence, (ii) a sequence of interest, (iii) a completion sequence, the completion sequence comprising a fourth recombinase recognition site and the second homology sequence, and (iv) a fifth recombinase recognition site positioned between the sequence of interest and the completion sequence; (e) obtaining in the host cell a recombination product comprising the first homology sequence, the sequence of interest, the fifth recombinase recognition site, and a functional sequence comprising the fourth recombinase recognition site and the second homology sequence; (f) introducing into the host cell the recombinase or the recombinase coding sequence; and (g) obtaining in the host cell a recombination product comprising the first homology sequence, the sequence of interest, and a truncated sequence comprising a sixth recombinase recognition site and the second homology sequence; wherein the first and second recombinase recognition sites can be the same or different; wherein the second and third recombinase recognition sites can be the same or different; wherein the third and fifth recombinase recognition sites can be the same or different; wherein the fourth and fifth recombinase recognition sites can be the same or different; and wherein the fifth and sixth recombinase recognition sites can be the same or different. Steps (d) through (f) may be repeated as desired, as detailed in a previously described embodiment, to obtain a host cell comprising multiple sequences of interest (FIGS. 12A-12B). In another embodiment, the target sequence further comprises a first mega-endonuclease recognition sequence positioned between the first homology sequence and the functional sequence, the donor sequence further comprises a second mega-endonuclease recognition sequence positioned between the sequence of interest and the completion sequence, and the method further comprises, any time after step (c) and prior to step (e), introducing into the host cell a mega-endonuclease or a mega-endonuclease coding sequence, the mega-endonuclease or an expression product of the mega-endonuclease coding sequence being capable of recognizing the first mega-endonuclease recognition sequence. Optionally, either of these embodiments may be used in conjunction with a method for down-regulating the expression level of at least one RecQ gene that is present in the genome of the host cell. Optionally, any of these embodiments may be used in conjunction with a method for up-regulating the expression level of at least one recombination-related gene that is present in the genome of the host cell.
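The repeated cycle of excision and targeted integration can be summarized in a schematic Python sketch. Several simplifications are assumed and are not part of the described methods: every recombinase recognition site is taken to be the same and is labeled RS, each construct is an ordered list of hypothetical element labels, mega-endonuclease recognition sequences are omitted, and after the first round the most recently added sequence of interest is used as the upstream homology sequence for the next donor.

```python
def excise(construct, site="RS"):
    """Delete the span between the first and last copy of site, keeping one copy."""
    first = construct.index(site)
    last = len(construct) - 1 - construct[::-1].index(site)
    if first == last:
        raise ValueError("two copies of the site are needed for excision")
    return construct[:first] + [site] + construct[last + 1:]


def integrate(target, donor, left_arm, right_arm):
    """Replace the target span between the arms with the donor span."""
    t_l, t_r = target.index(left_arm), target.index(right_arm)
    d_l, d_r = donor.index(left_arm), donor.index(right_arm)
    return target[:t_l] + donor[d_l:d_r + 1] + target[t_r + 1:]


def stack(sequences_of_interest):
    """Run one excise/integrate/excise cycle per sequence of interest."""
    # Starting target: H1, a site, and a functional marker whose homology
    # sequence (H2) is preceded by a second copy of the site.
    locus = ["H1", "RS", "promoter", "marker_5prime", "RS", "H2", "marker_3prime"]
    locus = excise(locus)            # functional marker becomes truncated
    left_arm = "H1"
    for soi in sequences_of_interest:
        donor = [left_arm, soi, "RS", "promoter", "marker_5prime", "RS", "H2"]
        locus = integrate(locus, donor, left_arm, "H2")   # marker restored
        locus = excise(locus)                              # truncated again
        left_arm = soi               # next donor is homologous to this insert
    return locus


final_locus = stack(["SOI_1", "SOI_2"])
```

In this model the marker is functional after each integration, which would allow selection, and is truncated again after each excision, so the same marker can be reused in the following round.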

In one embodiment, the recombinase can be introduced into the host cell as one or more nucleic acid molecules (DNA and/or RNA) that comprise the coding sequence for each constituent protein of the recombinase. The recombinase can be introduced as one or more expression cassettes comprising a coding region for each constituent protein, wherein each coding region is operatively linked to a promoter capable of expression in plant cells. Promoters for each expression cassette can be selected such that expression of the recombinase can be spatially or temporally regulated in any desired manner. For example, a promoter can be selected such that expression of the recombinase is constitutive, developmentally regulated, tissue specific, tissue preferred, cell specific, specific to a particular cellular compartment (i.e., organellar-specific), or the like. Additionally, promoters can be chosen so that expression of the recombinase can be chemically induced in a plant, resulting in expression of the recombinase only in response to treatment of the plant cell or tissue with a chemical ligand. By combining promoter elements that confer specific expression with those conferring chemically induced expression, the recombinase can be expressed or activated within specific cells or tissues of the plant in response to a chemical application. Any of a variety of plant expressible promoters can be used to drive expression of the recombinase. Several such promoters are described herein, and others are known in the art.

In another embodiment, the recombinase can be introduced into the plant cell by being stably transformed into the genome of the plant cell. For example, the recombinase can be comprised in one or more expression cassettes comprising the coding sequences of the recombinase, whereby the coding sequence for each protein component of the recombinase is operatively linked to a promoter capable of expression in plant tissues and cells. Suitable methods for stably transforming plant cells are known in the art and are described herein. In one embodiment, a plant cell that is stably transformed with the recombinase is also stably transformed with a donor sequence.

In one embodiment, the recombinase can be introduced into a plant cell such that the plant cell transiently expresses the recombinase. For example, one or more nucleotide sequences comprising the recombinase coding sequence can be introduced into a plant cell through Agrobacterium or microprojectile bombardment. Most of the introduced nucleotide sequences are not integrated into the genome but can be transcribed into mRNA.

In another embodiment, the coding sequence(s) of the recombinase can be supplied to the host cell in the form of messenger RNAs (mRNA). In this manner, the recombinase can be provided to the host cell only transiently. The coding sequence for each of the proteins of the recombinase can be inserted into a vector for in vitro transcription of the RNA using methods described in Lebel et al. 1995 Theor. Appl. Genet. 91:899-906 and U.S. Pat. No. 6,051,409. The RNA then can be transformed into a host cell, such as a cell from a donor line or a target line, for example. In one embodiment, the RNA is co-transformed into a host cell with a donor sequence. In an exemplary embodiment, the RNA is transferred to a host cell using microprojectile bombardment, as described in U.S. Pat. No. 6,051,409. In another embodiment, the RNA is introduced into protoplasts of a host cell by PEG-mediated transformation (see, e.g., Lebel et al. 1995 Theor. Appl. Genet. 91:899-906) or by electroporation. In another embodiment, other transformation techniques, such as microinjection of the RNA, are used to introduce the RNA into the host cell.

In a further embodiment, an active recombinase can be introduced into a host cell as one or more proteins, such as one or more purified proteins, for example. The recombinase protein can be introduced into the cell by any suitable method known in the art, such as, for example, microinjection or electroporation. In another embodiment, the recombinase is introduced into the host cell by microinjection together with a donor DNA sequence (see, e.g., Neuhaus et al. 1993 Cell 73:937-952). In another embodiment, the recombinase protein is introduced into the host cell through infection with Agrobacterium comprising a VirE2 or VirF fusion protein (see, e.g., Vergunst et al. 2000 Science 290:979-82).

In one embodiment, the coding sequence(s) of the recombinase can be optimized for expression in a particular plant host. It is known in the art that the expression of heterologous proteins in plants can be enhanced by optimizing the coding sequences of the proteins according to the codon preference of the host plant. The preferred codon usage in plants differs from the preferred codon usage in certain microorganisms. A comparison of the codon usage within a cloned microbial ORF (open reading frame) to the codon usage in plant genes (and, in particular, genes from the selected host plant) enables an identification of the codons within the ORF that can be changed in an effort to optimize the coding sequence for expression in the host plant.
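The codon comparison described above can be illustrated with a minimal Python sketch. It is not part of the disclosed methods. The function names, the example ORF, and the `HYPOTHETICAL_PREFERRED` mapping are assumptions made for illustration only; a real preference table would have to be derived from genes of the selected host plant.

```python
from itertools import product

# Standard genetic code, built in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Placeholder preferences (amino acid -> codon). These are NOT measured
# values for any plant; they only make the example runnable.
HYPOTHETICAL_PREFERRED = {"L": "CTG", "A": "GCC", "K": "AAG"}


def split_codons(orf):
    """Split an ORF into complete codons, ignoring any trailing partial codon."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3
    return [orf[i:i + 3] for i in range(0, usable, 3)]


def candidate_changes(orf, preferred):
    """List (codon_index, current_codon, preferred_codon) for every codon
    that encodes an amino acid with a listed preference but uses a
    different synonymous codon. Assumes the ORF contains only A, C, G, T."""
    changes = []
    for index, codon in enumerate(split_codons(orf)):
        amino_acid = CODON_TABLE[codon]
        wanted = preferred.get(amino_acid)
        if wanted is not None and wanted != codon:
            changes.append((index, codon, wanted))
    return changes


if __name__ == "__main__":
    example_orf = "ATGTTAGCTAAATAA"  # made-up ORF: Met-Leu-Ala-Lys-stop
    for index, codon, wanted in candidate_changes(example_orf, HYPOTHETICAL_PREFERRED):
        print(f"codon {index}: {codon} -> {wanted}")
```

Each reported substitution is synonymous, so the encoded protein is unchanged; only the codon choice moves toward the supplied preference table.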

General Methods and Components

Nucleotide sequences utilized in accordance with various embodiments of the invention can be incorporated into a host cell using conventional recombinant DNA technology. Generally, this involves using standard cloning procedures known in the art to insert a nucleotide sequence into an expression system, such as a vector, for example, with respect to which the nucleotide sequence is heterologous. The vector may contain additional elements that may be used during transcription and/or translation of the inserted coding sequence by the host cell that contains the vector. A large number of vector systems known in the art can be used, such as plasmids, bacteriophage viruses, other modified viruses, and the like. The components of the expression system may also be modified to increase expression levels of the inserted coding sequence. For example, truncated sequences, nucleotide substitutions, or other modifications may be employed. Expression systems known in the art can be used to transform virtually any crop plant cell under suitable conditions. Transformed cells may then be regenerated into whole plants. Methods for transforming dicots and monocots are known to those skilled in the art, as described below.

I. Expression Cassettes

Coding sequences intended for expression in transgenic plants are first assembled in expression cassettes 3′ to a suitable promoter expressible in plants. The expression cassettes can also comprise any further sequences needed or selected for the expression of the transgene. Such sequences include, but are not restricted to, transcription terminators, extraneous sequences to enhance expression such as introns, viral sequences, and sequences intended for the targeting of the gene product to specific organelles and cell compartments. These expression cassettes can then be transferred to the plant transformation vectors described herein.

The following is a description of various components of typical expression cassettes.

A. Promoters

Selection of the promoter to be used in expression cassettes will determine the spatial and temporal expression pattern of the transgene in the transgenic plant. Selected promoters will express transgenes in specific cell types (such as leaf epidermal cells, mesophyll cells, root cortex cells) or in specific tissues or organs (roots, leaves or flowers, for example) and selection should reflect the desired location of accumulation of the gene product. Alternatively, the selected promoter can drive expression of the gene under various inducing conditions. Promoters vary in their strength, i.e., ability to promote transcription. Depending upon the host cell system utilized, any one of a number of suitable promoters can be used, including the gene's native promoter. The following are non-limiting examples of promoters that can be used in the expression cassettes employed in the present invention.

1. Constitutive Promoters

a. Ubiquitin Promoters

Ubiquitin is a gene product known to accumulate in many cell types and its promoter has been cloned from several species for use in transgenic plants (e.g. sunflower: Binet et al. 1991 Plant Science 79: 87-94; maize: Christensen et al. 1989 Plant Molec. Biol. 12: 619-632; and Arabidopsis: Norris et al. 1993 Plant Mol. Biol. 21:895-906). The maize ubiquitin promoter has been developed in transgenic monocot systems and its sequence and vectors constructed for monocot transformation are disclosed in the patent publication EP 0 342 926. Taylor et al. (1993 Plant Cell Rep. 12: 491-495) describe a vector (pAHC25) that comprises the maize ubiquitin promoter and first intron and its high activity in cell suspensions of numerous monocotyledons when introduced via microprojectile bombardment. The Arabidopsis ubiquitin promoter may also be used with the nucleotide sequences of the present invention. The ubiquitin promoter is suitable for gene expression in transgenic plants, including both monocotyledons and dicotyledons. Suitable vectors include derivatives of pAHC25 or any of the transformation vectors described in this application. The vectors can be modified by the introduction of appropriate ubiquitin promoter and/or intron sequences.

b. The CaMV 35S Promoter

Construction of the plasmid pCGN1761 is described in published patent application EP 0 392 225 (Example 23). The plasmid contains the “double” CaMV 35S promoter and the tml transcriptional terminator with a unique EcoRI site between the promoter and the terminator and has a pUC-type backbone. A derivative of pCGN1761 is constructed which has a modified polylinker, which includes NotI and XhoI sites in addition to the existing EcoRI site. This derivative, designated pCGN1761ENX, is useful for the cloning of cDNA sequences or coding sequences (including microbial ORF sequences) within its polylinker for the purpose of their expression under the control of the 35S promoter in transgenic plants. The entire 35S promoter-coding sequence-tml terminator cassette of such a construction can be excised by HindIII, SphI, SalI, and XbaI sites 5′ to the promoter and XbaI, BamHI and BglI sites 3′ to the terminator for transfer to transformation vectors such as those described below. Furthermore, the double 35S promoter fragment can be removed by 5′ excision with HindIII, SphI, SalI, XbaI, or PstI, and 3′ excision with any of the polylinker restriction sites (EcoRI, NotI or XhoI) for replacement with another promoter. If desired, modifications around the cloning sites can be made by the introduction of sequences that can enhance translation. This is particularly useful when over-expression is desired. For example, pCGN1761ENX can be modified by optimization of the translational initiation site as described in Example 37 of U.S. Pat. No. 5,639,949.

c. The Actin Promoter

Several isoforms of actin are known to be expressed in most cell types and consequently the actin promoter is suitable for use as a constitutive promoter. In particular, the promoter from the rice ActI gene has been cloned and characterized (McElroy et al. 1990 Plant Cell 2: 163-171). A 1.3 kb fragment of the promoter was found to contain all the regulatory elements required for expression in rice protoplasts. Furthermore, numerous expression vectors based on the ActI promoter have been constructed specifically for use in monocotyledons (McElroy et al. 1991 Mol. Gen. Genet. 231: 150-160). These incorporate the ActI-intron 1, AdhI 5′ flanking sequence and AdhI-intron 1 (from the maize alcohol dehydrogenase gene) and sequence from the CaMV 35S promoter. Vectors showing highest expression were fusions of 35S and ActI intron or the ActI 5′ flanking sequence and the ActI intron. Optimization of sequences around the initiating ATG (of the GUS reporter gene) also enhanced expression. The promoter expression cassettes described by McElroy et al. (1991 Mol. Gen. Genet. 231: 150-160) can be easily modified for gene expression and are particularly suitable for use in monocotyledonous hosts. For example, promoter-containing fragments can be removed from the McElroy constructions and used to replace the double 35S promoter in pCGN1761ENX, which is then available for the insertion of specific gene sequences. The fusion genes thus constructed can then be transferred to appropriate transformation vectors. In a separate report, the rice ActI promoter with its first intron has also been found to direct high expression in cultured barley cells (Chibbar et al. 1993 Plant Cell Rep. 12: 506-509).

2. Inducible Expression

a. PR-1 Promoters

The double 35S promoter in pCGN1761ENX can be replaced with any other promoter of choice that will result in suitably high expression levels. By way of example, one of the chemically regulatable promoters described in U.S. Pat. No. 5,614,395, such as the tobacco PR-1a promoter, can replace the double 35S promoter. Alternatively, the Arabidopsis PR-1 promoter described in Lebel et al. 1998 Plant J. 16:223-233 can be used. The promoter of choice can be excised from its source by restriction enzymes; alternatively, it can be PCR-amplified using primers that carry appropriate terminal restriction sites. If PCR-amplification is undertaken, then the promoter can be re-sequenced to check for amplification errors after the cloning of the amplified promoter in the target vector. The chemically/pathogen regulatable tobacco PR-1a promoter is cleaved from plasmid pCIB1004 (for construction, see example 21 of EP 0 332 104) and transferred to plasmid pCGN1761ENX (Uknes et al. 1992 Plant Cell 4: 645-656). The plasmid pCIB1004 is cleaved with NcoI and the resultant 3′ overhang of the linearized fragment is rendered blunt by treatment with T4 DNA polymerase. The fragment is then cleaved with HindIII and the resultant PR-1a promoter-containing fragment is gel purified and cloned into pCGN1761ENX from which the double 35S promoter has been removed. This is done by cleavage with XhoI and blunting with T4 polymerase, followed by cleavage with HindIII and isolation of the larger vector-terminator containing fragment into which the pCIB1004 promoter fragment is cloned. This generates a pCGN1761ENX derivative with the PR-1a promoter and the tml terminator and an intervening polylinker with unique EcoRI and NotI sites. The selected coding sequence can be inserted into this vector, and the fusion products (i.e. promoter-gene-terminator) can subsequently be transferred to any selected transformation vector, including those described infra. Various chemical regulators can be employed to induce expression of the selected coding sequence in plants transformed in accordance with various embodiments of the invention, including the benzothiadiazole, isonicotinic acid, and salicylic acid compounds disclosed in U.S. Pat. Nos. 5,523,311 and 5,614,395.

b. Ethanol-Inducible Promoters

A promoter inducible by certain alcohols or ketones, such as ethanol, can also be used to confer inducible expression of a coding sequence in accordance with various embodiments of the invention. Such a promoter is, for example, the alcA gene promoter from Aspergillus nidulans (Caddick et al. 1998 Nat. Biotechnol 16:177-180). In A. nidulans, the alcA gene encodes alcohol dehydrogenase I, the expression of which is regulated by the AlcR transcription factor in the presence of the chemical inducer. For the purposes of the present disclosure, the CAT coding sequences in plasmid palcA:CAT comprising an alcA gene promoter sequence fused to a minimal 35S promoter (Caddick et al. 1998 Nat. Biotechnol 16:177-180) can be replaced by a selected coding sequence to form an expression cassette having the coding sequence under the control of the alcA gene promoter. This is carried out using methods well known in the art.

c. Glucocorticoid-Inducible Promoter

Induction of expression of a nucleic acid sequence using systems based on steroid hormones is also contemplated. For example, a glucocorticoid-mediated induction system is used (Aoyama and Chua 1997 The Plant Journal 11: 605-612) and gene expression is induced by application of a glucocorticoid, such as a synthetic glucocorticoid (e.g., dexamethasone). In one embodiment, the glucocorticoid is present at a concentration ranging from about 0.1 mM to about 1 mM. In another embodiment, the glucocorticoid is present at a concentration ranging from about 10 mM to 100 mM. For the purposes of the present disclosure, the luciferase gene sequences can be replaced by a sequence of interest to form an expression cassette having a sequence of interest under the control of six copies of the GAL4 upstream activating sequences fused to the 35S minimal promoter. This is carried out using methods well known in the art. The trans-acting factor comprises the GAL4 DNA-binding domain (Keegan et al. 1986 Science 231: 699-704) fused to the transactivating domain of the herpes viral protein VP16 (Triezenberg et al. 1988 Genes Devel. 2: 718-729) fused to the hormone-binding domain of the rat glucocorticoid receptor (Picard et al. 1988 Cell 54: 1073-1080). The expression of the fusion protein can be controlled by any promoter suitable for expression in plants, as known in the art or described here. This expression cassette can also comprise a sequence of interest fused to the 6×GAL4/minimal promoter. Thus, tissue- or organ-specificity of the fusion protein can be achieved, leading to inducible tissue- or organ-specificity of the expression cassette.

d. Wound-Inducible Promoters

Wound-inducible promoters can also be suitable for gene expression. Numerous such promoters have been described (e.g. Xu et al. 1993 Plant Molec. Biol. 22: 573-588, Logemann et al. 1989 Plant Cell 1: 151-158, Rohrmeier & Lehle 1993 Plant Molec. Biol. 22: 783-792, Firek et al. 1993 Plant Molec. Biol. 22: 129-142, Warner et al. 1993 Plant J. 3: 191-201) and all are suitable for use with various embodiments of the invention. Logemann et al. describe the 5′ upstream sequences of the dicotyledonous potato wun1 gene. Xu et al. show that a wound-inducible promoter from the dicotyledon potato (pin2) is active in the monocotyledon rice. Further, Rohrmeier & Lehle describe the cloning of the maize WipI cDNA, which is wound induced and which can be used to isolate the cognate promoter using standard techniques. Similarly, Firek et al. and Warner et al. have described a wound-induced gene from the monocotyledon Asparagus officinalis, which is expressed at local wound and pathogen invasion sites. Using cloning techniques well known in the art, these promoters can be transferred to suitable vectors, fused to a sequence of interest, for example, and used to express the sequence of interest at sites of plant wounding.

3. Tissue-Specific or Tissue-Preferred Expression

a. Root-Preferred Expression

Another pattern of gene expression is root expression. A suitable root promoter for use with various embodiments of the invention is the promoter of the maize metallothionein-like (MTL) gene described by de Framond (FEBS 290: 103-106 (1991)) and also in U.S. Pat. No. 5,466,785. This “MTL” promoter is transferred to a suitable vector such as pCGN1761ENX for the insertion of a selected gene and subsequent transfer of the entire promoter-gene-terminator cassette to a transformation vector of interest.

b. Pith-Preferred Expression

Patent Application WO 93/07278 describes the isolation of the maize trpA gene, which is preferentially expressed in pith cells. The gene sequence and promoter extending up to −1726 bp from the start of transcription are presented. Using standard molecular biological techniques, this promoter, or parts thereof, can be transferred to a vector such as pCGN1761 where it can replace the 35S promoter and be used to drive the expression of a foreign gene in a pith-preferred manner. In fact, fragments containing the pith-preferred promoter or parts thereof can be transferred to any vector and modified for utility in transgenic plants.

c. Leaf-Specific Expression

A maize gene encoding phosphoenolpyruvate carboxylase (PEPC) has been described by Hudspeth & Grula (1989 Plant Molec Biol 12: 579-589). Using standard molecular biological techniques, the promoter for this gene can be used to drive the expression of any gene in a leaf-specific manner in transgenic plants.

d. Pollen-Specific Expression

WO 93/07278 (published Apr. 15, 1993; Ciba Geigy) describes the isolation of the maize calcium-dependent protein kinase (CDPK) gene, which is expressed in pollen cells. The gene sequence and promoter extend up to 1400 bp from the start of transcription. Using standard molecular biological techniques, this promoter, or parts thereof, can be transferred to a vector such as pCGN1761 where it can replace the 35S promoter and be used to drive the expression of a sequence of interest in a pollen-specific manner.

B. Transcriptional Terminators

A variety of transcriptional terminators are available for use in the expression cassettes of the present invention. These are responsible for the termination of transcription beyond the transgene and correct mRNA polyadenylation. Suitable transcriptional terminators are those that are known to function in plants and include, but are not limited to, the CaMV 35S terminator, the tml terminator, the nopaline synthase terminator and the pea rbcS E9 terminator. These can be used in both monocotyledons and dicotyledons. In addition, a gene's native transcription terminator can be used.

C. Sequences for the Enhancement or Regulation of Expression

Numerous sequences have been found to enhance gene expression from within the transcriptional unit, and these sequences can be used in conjunction with various genes to increase their expression in transgenic plants.

Various intron sequences have been shown to enhance expression, particularly in monocotyledonous cells. For example, the introns of the maize AdhI gene have been found to significantly enhance the expression of the wild-type gene under its cognate promoter when introduced into maize cells. Intron 1 was found to be particularly effective and enhanced expression in fusion constructs with the chloramphenicol acetyltransferase gene (Callis et al. 1987 Genes Develop. 1: 1183-1200). In the same experimental system, the intron from the maize bronze1 gene had a similar effect in enhancing expression. Intron sequences have been routinely incorporated into plant transformation vectors, typically within the non-translated leader.

A number of non-translated leader sequences derived from viruses are also known to enhance expression, and these are particularly effective in dicotyledonous cells. Specifically, leader sequences from Tobacco Mosaic Virus (TMV, the “Ω-sequence”), Maize Chlorotic Mottle Virus (MCMV), and Alfalfa Mosaic Virus (AMV) have been shown to be effective in enhancing expression (e.g. Gallie et al. 1987 Nucl. Acids Res. 15: 8693-8711; Skuzeski et al. 1990 Plant Molec. Biol. 15: 65-79). Other leader sequences known in the art include but are not limited to: picornavirus leaders, for example, EMCV leader (Encephalomyocarditis 5′ noncoding region) (Elroy-Stein, Fuerst, and Moss 1989 PNAS USA 86:6126-6130); potyvirus leaders, for example, TEV leader (Tobacco Etch Virus) and MDMV leader (Maize Dwarf Mosaic Virus) (Allison et al. 1986 Virology 154:9-20); human immunoglobulin heavy-chain binding protein (BiP) leader (Macejak and Sarnow 1991 Nature 353: 90-94); untranslated leader from the coat protein mRNA of alfalfa mosaic virus (AMV RNA 4) (Jobling and Gehrke 1987 Nature 325:622-625); tobacco mosaic virus leader (TMV) (Gallie et al. 1989 Molecular Biology of RNA, pages 237-256); and Maize Chlorotic Mottle Virus leader (MCMV) (Lommel et al. 1991 Virology 81:382-385). See also, Della-Cioppa et al. 1987 Plant Physiology 84:965-968.

D. Synthetic Genes

In various embodiments of the invention, coding sequences for selected proteins, such as a mega-endonuclease or a site-specific recombinase, for example, can be optimized for expression in a particular plant host. It is known in the art that protein expression in plants can be enhanced by optimizing the coding regions of genes to the codon preference of the host. The preferred codon usage in plants differs from the preferred codon usage in certain microorganisms. Comparison of the usage of codons within a cloned microbial ORF to usage in plant genes (and in particular genes from the target plant) enables an identification of the codons within the ORF that can be changed. Typically, plant evolution has tended towards a strong preference for the nucleotides C and G in the third base position in monocotyledons, whereas dicotyledons often use the nucleotides A or T at this position. By modifying a gene to incorporate preferred codon usage for a particular target transgenic species, many of the problems described below for GC/AT content and illegitimate splicing will be overcome.
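The third-base preference noted above can be quantified with a short Python sketch. The function name `gc3_fraction` and both example sequences are assumptions introduced for illustration; the source does not prescribe any particular metric or threshold.

```python
def gc3_fraction(orf: str) -> float:
    """Fraction of complete codons whose third base is G or C.

    Slicing from index 2 with a step of 3 visits only third positions of
    complete codons, so a trailing partial codon is ignored.
    """
    third_bases = orf.upper()[2::3]
    if not third_bases:
        raise ValueError("sequence contains no complete codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)


if __name__ == "__main__":
    # Made-up sequences, not taken from any real gene.
    print(gc3_fraction("ATGGCCCTGAAG"))  # every third base is G or C
    print(gc3_fraction("ATGGCTTTAAAA"))  # mostly A or T in third position
```

A low value for a microbial ORF intended for a monocotyledonous host would suggest that third-position changes are worth examining, subject to the codon usage actually observed in that host.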

Plant genes typically have a GC content of more than 35%. ORF sequences which are rich in A and T nucleotides can cause several problems in plants. Firstly, motifs of ATTTA are believed to cause destabilization of messenger RNA (mRNA) and are found at the 3′ end of many short-lived mRNAs. Secondly, the occurrence of polyadenylation signals, such as AATAAA, at inappropriate positions within the mRNA is believed to cause premature truncation of transcription. In addition, monocotyledons may recognize AT-rich sequences as introns and may identify flanking splice sites (see below).
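A coding sequence can be screened for the features named above with a few lines of Python. This is a sketch under stated assumptions: the helper names and the example sequence are hypothetical, the motifs are searched in the DNA-sense strand exactly as written in the text, and the 35% figure is used only as the rough guide the text gives.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def find_motif(seq: str, motif: str) -> list:
    """Return 0-based start positions of every (possibly overlapping)
    occurrence of motif in seq."""
    seq, motif = seq.upper(), motif.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


if __name__ == "__main__":
    example = "ATGATTTAGCAATAAAGC"  # made-up sequence for illustration
    print(f"GC content: {gc_content(example):.2f}")
    print("ATTTA at:", find_motif(example, "ATTTA"))
    print("AATAAA at:", find_motif(example, "AATAAA"))
```

Positions reported this way only flag candidates for review; whether a given motif is actually harmful in a particular host is not something the scan can decide.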

Plants differ from microorganisms in that their mRNAs do not possess a defined ribosome-binding site. Rather, it is believed that ribosomes attach to the 5′ end of the mRNA and scan for the first available ATG at which to start translation. Nevertheless, it is believed that there is a preference for certain nucleotides adjacent to the ATG and that expression of microbial genes can be achieved by the inclusion of a eukaryotic consensus translation initiator at the ATG. Clontech (1993/1994 catalog, page 210, incorporated herein by reference) have suggested one sequence as a consensus translation initiator for the expression of the E. coli uidA gene in plants. Further, Joshi (1987 NAR 15: 6643-6653) has compared many plant sequences adjacent to the ATG and suggests another consensus sequence. In situations where difficulties are encountered in the expression of microbial ORFs in plants, inclusion of one of these sequences at the initiating ATG may improve translation. In such cases, the last three nucleotides of the consensus may not be appropriate for inclusion in the modified sequence due to their modification of the second amino acid residue. Preferred sequences adjacent to the initiating methionine may differ between different plant species. A survey of 14 maize genes located in the GenBank database provided the following results:

Position Before the Initiating ATG in 14 Maize Genes:

     −10   −9   −8   −7   −6   −5   −4   −3   −2   −1
C      3    8    4    6    2    5    6    0   10    7
T      3    0    3    4    3    2    1    1    1    0
A      2    3    1    4    3    2    3    7    2    3
G      6    3    6    0    6    5    4    6    1    5

This analysis can be done for the desired plant species into which the nucleotide sequence is being incorporated, and the sequence adjacent to the ATG modified to incorporate the preferred nucleotides.
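As an illustration of how such a tally can be turned into preferred nucleotides, the following Python sketch picks the most frequent base at each position and reports ties explicitly. The counts are transcribed from the maize survey above, reading the fused "37" in row A as the two values 3 and 7, which is an assumption about the original layout. The function name is hypothetical.

```python
# Positions -10 .. -1 relative to the initiating ATG.
POSITIONS = list(range(-10, 0))

# Counts per nucleotide, one value per position, from the maize survey.
COUNTS = {
    "C": [3, 8, 4, 6, 2, 5, 6, 0, 10, 7],
    "T": [3, 0, 3, 4, 3, 2, 1, 1, 1, 0],
    "A": [2, 3, 1, 4, 3, 2, 3, 7, 2, 3],
    "G": [6, 3, 6, 0, 6, 5, 4, 6, 1, 5],
}


def preferred_nucleotides(counts):
    """For each position, return the most frequent nucleotide(s).

    Ties are kept and joined with '/', in alphabetical order, rather
    than being resolved arbitrarily.
    """
    width = len(next(iter(counts.values())))
    preferred = []
    for i in range(width):
        column = {nt: values[i] for nt, values in counts.items()}
        best = max(column.values())
        preferred.append("/".join(sorted(nt for nt, n in column.items() if n == best)))
    return preferred


if __name__ == "__main__":
    for position, choice in zip(POSITIONS, preferred_nucleotides(COUNTS)):
        print(f"{position:>4}: {choice}")
```

With only 14 genes in the survey, several positions are decided by small margins, so the output is better read as a tendency than as a fixed consensus.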

Genes cloned from non-plant sources and not optimized for expression in plants may also contain motifs which may be recognized in plants as 5′ or 3′ splice sites and may be cleaved, thus generating truncated or deleted mRNAs. These sites can be removed using techniques well known in the art.

Techniques for modifying coding sequences and adjacent sequences are well known in the art. In cases where the initial expression of a microbial ORF is low and it is deemed appropriate to make alterations to the sequence as described above, then the construction of synthetic genes can be accomplished according to methods well known in the art. See, e.g., EP 0 385 962, EP 0 359 472, and WO 93/07278. In most cases, it is preferable to assay the expression of gene constructions using transient assay protocols (which are well known in the art) prior to their use in generating transgenic plants.

II. Plant Transformation Vectors and Selectable Markers

Numerous transformation vectors known to those of ordinary skill in the plant transformation arts are available for plant transformation, and the nucleotide sequences pertinent to the invention can be used in conjunction with any such vectors. The selection of a particular vector will depend upon the preferred transformation technique and the target species for transformation. For certain target species, different antibiotic or herbicide selection markers may be preferred. Selection markers used routinely in transformation include the nptII gene, which confers resistance to kanamycin and related antibiotics (Messing & Vierra 1982 Gene 19: 259-268; Bevan et al. 1983 Nature 304:184-187), the bar gene, which confers resistance to the herbicide phosphinothricin (White et al. 1990 Nucl. Acids Res 18: 1062, Spencer et al. 1990 Theor. Appl. Genet 79: 625-631), the hpt gene, which confers resistance to the antibiotic hygromycin (Blochinger & Diggelmann Mol Cell Biol 4: 2929-2931), the dhfr gene, which confers resistance to methotrexate (Bourouis et al. 1983 EMBO J. 2(7): 1099-1104), the EPSPS gene, which confers resistance to glyphosate (U.S. Pat. Nos. 4,940,835 and 5,188,642), and the mannose-6-phosphate isomerase gene (also referred to herein as the phosphomannose isomerase, or PMI, gene), which provides the ability to metabolize mannose (U.S. Pat. Nos. 5,767,378 and 5,994,629).

A. Vectors Suitable for Agrobacterium Transformation

Many vectors are available for transformation using Agrobacterium tumefaciens. These typically carry at least one T-DNA border sequence and include vectors such as pBIN19 (Bevan Nucl. Acids Res. (1984)). Below, the construction of two typical vectors suitable for Agrobacterium transformation is described.

1. pCIB200 and pCIB2001

The binary vectors pCIB200 and pCIB2001 are used for the construction of recombinant vectors for use with Agrobacterium and are constructed in the following manner. pTJS75kan is created by NarI digestion of pTJS75 (Schmidhauser & Helinski 1985 J. Bacteriol. 164: 446-455) allowing excision of the tetracycline-resistance gene, followed by insertion of an AccI fragment from pUC4K carrying an NPTII gene (Messing & Vierra 1982 Gene 19: 259-268; Bevan et al. 1983 Nature 304: 184-187; McBride et al. 1990 Plant Molecular Biology 14: 266-276). XhoI linkers are ligated to the EcoRV fragment of pCIB7 which contains the left and right T-DNA borders, a plant selectable nos/nptII chimeric gene and the pUC polylinker (Rothstein et al. 1987 Gene 53: 153-161), and the XhoI-digested fragment is cloned into SalI-digested pTJS75kan to create pCIB200 (see also EP 0 332 104, example 19). pCIB200 contains the following unique polylinker restriction sites: EcoRI, SstI, KpnI, BglII, XbaI, and SalI. pCIB2001 is a derivative of pCIB200 created by the insertion into the polylinker of additional restriction sites. Unique restriction sites in the polylinker of pCIB2001 are EcoRI, SstI, KpnI, BglII, XbaI, SalI, MluI, BclI, AvrII, ApaI, HpaI, and StuI. pCIB2001, in addition to containing these unique restriction sites, also has plant and bacterial kanamycin selection, left and right T-DNA borders for Agrobacterium-mediated transformation, the RK2-derived trfA function for mobilization between E. coli and other hosts, and the OriT and OriV functions also from RK2. The pCIB2001 polylinker is suitable for the cloning of plant expression cassettes containing their own regulatory signals.

2. pCIB10 and Hygromycin Selection Derivatives Thereof

The binary vector pCIB10 contains a gene encoding kanamycin resistance for selection in plants and T-DNA right and left border sequences. pCIB10 incorporates sequences from the wide host-range plasmid pRK252 allowing it to replicate in both E. coli and Agrobacterium. Its construction is described by Rothstein et al. (1987 Gene 53: 153-161). Various derivatives of pCIB10 are constructed, which incorporate the gene for hygromycin B phosphotransferase described by Gritz et al. (1983 Gene 25: 179-188). These derivatives enable selection of transgenic plant cells on hygromycin only (pCIB743), or hygromycin and kanamycin (pCIB715, pCIB717).

B. Vectors Suitable for Non-Agrobacterium Transformation

Transformation without the use of Agrobacterium tumefaciens circumvents the requirement for T-DNA sequences in the chosen transformation vector, and, consequently, vectors lacking these sequences can be utilized in addition to vectors such as the ones described above which contain T-DNA sequences. Transformation techniques that do not rely on Agrobacterium include transformation via particle bombardment, protoplast uptake (e.g., PEG and electroporation), and microinjection. The choice of vector depends largely on the selected transformation method. Below, the construction of typical vectors suitable for non-Agrobacterium transformation is described.

1. pCIB3064

pCIB3064 is a pUC-derived vector suitable for direct gene transfer techniques in combination with selection by the herbicide basta (or phosphinothricin). The plasmid pCIB246 comprises the CaMV 35S promoter in operational fusion to the E. coli GUS gene and the CaMV 35S transcriptional terminator and is described in the published PCT application WO 93/07278. The 35S promoter of this vector contains two ATG sequences 5′ of the start site. These sites are mutated using standard PCR techniques in such a way as to remove the ATGs and generate the restriction sites SspI and PvuII. The new restriction sites are 96 and 37 bp away from the unique SalI site and 101 and 42 bp away from the actual start site. The resultant derivative of pCIB246 is designated pCIB3025. The GUS gene is then excised from pCIB3025 by digestion with SalI and SacI, the termini rendered blunt and religated to generate plasmid pCIB3060. The plasmid pJIT82 may be obtained from the John Innes Centre, Norwich, and the 400 bp SmaI fragment containing the bar gene from Streptomyces viridochromogenes is excised and inserted into the HpaI site of pCIB3060 (Thompson et al. 1987 EMBO J 6: 2519-2523). This generates pCIB3064, which comprises the bar gene (for herbicide selection) under the control of the CaMV 35S promoter and terminator, a gene for ampicillin resistance (for selection in E. coli), and a polylinker with the unique sites SphI, PstI, HindIII, and BamHI. This vector is suitable for the cloning of plant expression cassettes containing their own regulatory signals.

2. pSOG19 and pSOG35

The plasmid pSOG35 is a transformation vector that utilizes the E. coli gene dihydrofolate reductase (DHFR) as a selectable marker conferring resistance to methotrexate. PCR is used to amplify the 35S promoter (~800 bp), intron 6 from the maize Adh1 gene (~550 bp), and 18 bp of the GUS untranslated leader sequence from pSOG10. A 250-bp fragment encoding the E. coli dihydrofolate reductase type II gene is also amplified by PCR, and these two PCR fragments are assembled with a SacI-PstI fragment from pBI221 (Clontech), which comprises the pUC19 vector backbone and the nopaline synthase terminator. Assembly of these fragments generates pSOG19, which contains the 35S promoter in fusion with the intron 6 sequence, the GUS leader, the DHFR gene, and the nopaline synthase terminator. Replacement of the GUS leader in pSOG19 with the leader sequence from Maize Chlorotic Mottle Virus (MCMV) generates the vector pSOG35. pSOG19 and pSOG35 carry the pUC gene for ampicillin resistance and have HindIII, SphI, PstI and EcoRI sites available for the cloning of foreign substances.

C. Vector Suitable for Chloroplast Transformation

For expression of a nucleotide sequence in plant plastids, plastid transformation vector pPH143 (WO 97/32011, example 36) can be used. The nucleotide sequence is inserted into pPH143, thereby replacing the PROTOX coding sequence. This vector is then used for plastid transformation and selection of transformants for spectinomycin resistance. Alternatively, the nucleotide sequence is inserted in pPH143 so that it replaces the aadH gene. In this case, transformants are selected for resistance to PROTOX inhibitors.

III. Transformation Methods

Target, donor, and other nucleotide sequence cassettes in accordance with the various embodiments of the invention can be introduced into the plant cell in a number of art-recognized ways. Methods for regenerating plants are also well known in the art. For example, Ti plasmid-derived vectors have been utilized for the delivery of foreign DNA, as well as direct DNA uptake, liposomes, electroporation, microinjection, and microprojectiles. In addition, bacteria from the genus Agrobacterium can be utilized to transform plant cells.

Once a desired DNA sequence has been transformed into a particular plant species, it may be propagated in that species or moved into other varieties of the same species, particularly including commercial varieties, using traditional breeding techniques.

Below are descriptions of representative techniques for transforming both dicotyledonous and monocotyledonous plants, as well as a representative plastid transformation technique.

A. Transformation of Dicotyledons

Transformation techniques for dicotyledons are well known in the art and include Agrobacterium-based techniques and techniques that do not require Agrobacterium. Non-Agrobacterium techniques involve the uptake of exogenous genetic material directly by protoplasts or cells. This can be accomplished by PEG- or electroporation-mediated uptake, particle bombardment-mediated delivery, or microinjection. Examples of these techniques are described by Paszkowski et al. 1984 EMBO J 3: 2717-2722, Potrykus et al. 1985 Mol. Gen. Genet. 199: 169-177, Reich et al. 1986 Biotechnology 4: 1001-1004, and Klein et al. 1987 Nature 327: 70-73. In each case, the transformed cells are regenerated into whole plants using standard techniques known in the art.

Agrobacterium-mediated transformation is a preferred technique for the transformation of dicotyledons because of its high transformation efficiency and its broad utility with many different species. Agrobacterium transformation typically involves the transfer of a binary vector carrying a foreign DNA of interest (e.g., pCIB200 or pCIB2001) to an appropriate Agrobacterium strain, which may depend on the complement of vir genes carried by the host Agrobacterium strain either on a co-resident Ti plasmid or chromosomally (e.g., strain CIB542 for pCIB200 and pCIB2001 (Uknes et al. 1993 Plant Cell 5: 159-169)). The transfer of the recombinant binary vector to Agrobacterium is accomplished by a triparental mating procedure using E. coli carrying the recombinant binary vector and a helper E. coli strain which carries a plasmid such as pRK2013 and which is able to mobilize the recombinant binary vector to the target Agrobacterium strain. Alternatively, the recombinant binary vector can be transferred to Agrobacterium by DNA transformation (Höfgen & Willmitzer, 1988 Nucl. Acids Res. 16: 9877).

Transformation of the target plant species by recombinant Agrobacterium usually involves co-cultivation of the Agrobacterium with explants from the plant and follows protocols well known in the art. Transformed tissue is regenerated on a selection medium containing the compound (e.g., the antibiotic, herbicide, or carbohydrate source) that corresponds to the selectable marker sequence (e.g., antibiotic or herbicide resistance gene or PMI gene) present between the binary plasmid's T-DNA borders.

Another approach to transforming a plant cell with a gene involves propelling inert or biologically active particles at plant tissues and cells. This technique is disclosed in U.S. Pat. Nos. 4,945,050, 5,036,006, and 5,100,792, all issued to Sanford et al. Generally, this procedure involves propelling inert or biologically active particles at the cells under conditions effective to penetrate the outer surface of the cell and afford incorporation within the interior thereof. When inert particles are utilized, the vector can be introduced into the cell by coating the particles with the vector containing the desired gene. Alternatively, the target cell can be surrounded by the vector so that the vector is carried into the cell by the wake of the particle. Biologically active particles (e.g., dried yeast cells, dried bacterium, or a bacteriophage, each containing DNA sought to be introduced) can also be propelled into plant cell tissue.

B. Transformation of Monocotyledons

Transformation of most monocotyledon species has now also become routine. Preferred techniques include direct gene transfer into protoplasts using PEG (polyethylene glycol) or electroporation techniques, particle bombardment into callus tissue, and transformation mediated by Agrobacterium. Transformations can be undertaken with a single DNA species or multiple DNA species (i.e., co-transformation), both of which are suitable for use with the methods disclosed herein. Co-transformation may have the advantage of avoiding complete vector construction and of generating transgenic plants with unlinked loci for the gene of interest and either the selectable marker or other sequences, such as those used for improving transformation efficiency, thereby enabling the removal of the selectable marker or other sequences in subsequent generations, should this be regarded as desirable. However, a disadvantage of the use of co-transformation is the less than 100% frequency with which separate DNA species are integrated into the genome (Schocher et al. 1986 Biotechnology 4: 1093-1096).

Patent Applications EP 0 292 435, EP 0 392 225, and WO 93/07278 describe techniques for the preparation of callus and protoplasts from an elite inbred line of maize, transformation of protoplasts using PEG or electroporation, and the regeneration of maize plants from transformed protoplasts. Gordon-Kamm et al. (1990 Plant Cell 2: 603-618) and Fromm et al. (1990 Biotechnology 8: 833-839) have published techniques for transformation of an A188-derived maize line using particle bombardment. Furthermore, WO 93/07278 and Koziel et al. (1993 Biotechnology 11: 194-200) describe techniques for the transformation of elite inbred lines of maize by particle bombardment. This technique utilizes immature maize embryos of 1.5-2.5 mm length excised from a maize ear 14-15 days after pollination and a PDS-1000He Biolistics device for bombardment.

Transformation of rice can also be undertaken by direct gene transfer techniques utilizing protoplasts or particle bombardment. Protoplast-mediated transformation has been described for Japonica-types and Indica-types (Zhang et al. 1988 Plant Cell Rep 7: 379-384; Shimamoto et al. 1989 Nature 338: 274-277; Datta et al. 1990 Biotechnology 8: 736-740). Both types are also routinely transformable using particle bombardment (Christou et al. 1991 Biotechnology 9: 957-962). Furthermore, WO 93/21335 describes techniques for the transformation of rice via electroporation. Patent Application EP 0 332 581 describes techniques for the generation, transformation, and regeneration of Pooideae protoplasts. These techniques allow the transformation of Dactylis and wheat.

Furthermore, wheat transformation has been described by Vasil et al. (1992 Biotechnology 10: 667-674) using particle bombardment into cells of type C long-term regenerable callus, and also by Vasil et al. (1993 Biotechnology 11: 1553-1558) and Weeks et al. (1993 Plant Physiol. 102: 1077-1084) using particle bombardment of immature embryos and immature embryo-derived callus.

One technique for wheat transformation involves the transformation of wheat by particle bombardment of immature embryos and includes either a high sucrose or a high maltose step prior to gene delivery. Prior to bombardment, any convenient number of embryos (0.75-1 mm in length) can be plated onto MS medium with 3% sucrose (Murashige & Skoog 1962 Physiologia Plantarum 15: 473-497) and 3 mg/l 2,4-D for induction of somatic embryos, which is allowed to proceed in the dark. On the chosen day of bombardment, embryos are removed from the induction medium and placed onto the osmoticum (i.e., induction medium with sucrose or maltose added at the desired concentration, typically 15%). The embryos are allowed to plasmolyze for 2-3 h and are then bombarded. Twenty embryos per target plate is typical, although not critical. An appropriate gene-carrying plasmid (such as pCIB3064 or pSOG35) is precipitated onto micrometer-size gold particles using standard procedures. Each plate of embryos is shot with the DuPont Biolistics® helium device using a burst pressure of 1000 psi and a standard 80 mesh screen. After bombardment, the embryos are placed back into the dark to recover for about 24 h (still on osmoticum). After 24 h, the embryos are removed from the osmoticum and placed back onto induction medium, where they stay for about a month before regeneration. Approximately one month later, the embryo explants with developing embryogenic callus are transferred to regeneration medium (MS+1 mg/liter NAA, 5 mg/liter GA), further containing the appropriate selection agent (10 mg/l basta in the case of pCIB3064 and 2 mg/l methotrexate in the case of pSOG35). After approximately one month, developed shoots are transferred to larger sterile containers known as “GA7s” which contain half-strength MS, 2% sucrose, and the same concentration of selection agent.

Transformation of monocotyledons using Agrobacterium has also been described. See WO 94/00977 and U.S. Pat. No. 5,591,616. Rice transformation using Agrobacterium has been described in a number of publications, including Hiei et al. 1994 Plant J. 6:271-282, Dong et al. 1996 Molecular Breeding 2:267-276, and Hiei et al. 1997 Plant Molecular Biol. 35:205-218. Efficient maize transformation using Agrobacterium infection of immature embryos and various selection markers also has been described (Ishida et al. 1996 Nature Biotechnology 14:745-750; Negrotto et al. 2000 Plant Cell Reports 19:798-803; and Li et al. 2003 Plant Physiol. 133:736-747).

C. Transformation of Plastids

Seeds of Nicotiana tabacum c.v. ‘Xanthi nc’ are germinated seven per plate in a 1 inch circular array on T agar medium and bombarded 12-14 days after sowing with 1 μm tungsten particles (M10, Biorad, Hercules, Calif.) coated with DNA from plasmids pPH143 and pPH145 essentially as described (Svab and Maliga 1993 PNAS 90: 913-917). Bombarded seedlings are incubated on T medium for two days, after which leaves are excised and placed abaxial side up in bright light (350-500 μmol photons/m²/s) on plates of RMOP medium (Svab, Hajdukiewicz, and Maliga 1990 PNAS 87: 8526-8530) containing 500 μg/ml spectinomycin dihydrochloride (Sigma, St. Louis, Mo.). Resistant shoots appearing underneath the bleached leaves three to eight weeks after bombardment are subcloned onto the same selective medium, allowed to form callus, and secondary shoots isolated and subcloned. Complete segregation of transformed plastid genome copies (homoplasmicity) in independent subclones is assessed by standard techniques of Southern blotting (Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor). BamHI/EcoRI-digested total cellular DNA (Mettler 1987 Plant Mol Biol Reporter 5: 346-349) is separated on 1% Tris-borate (TBE) agarose gels, transferred to nylon membranes (Amersham) and probed with ³²P-labeled random primed DNA sequences corresponding to a 0.7 kb BamHI/HindIII DNA fragment from pC8 containing a portion of the rps7/12 plastid targeting sequence. Homoplasmic shoots are rooted aseptically on spectinomycin-containing MS/IBA medium (McBride et al. 1994 PNAS 91: 7301-7305) and transferred to the greenhouse.

The foregoing describes various embodiments of the invention and is not intended to limit the scope of the invention as defined in the appended claims. The following Examples are included merely to demonstrate the practice of selected embodiments and should be regarded in an illustrative, rather than a restrictive, manner.

EXAMPLES

Standard recombinant DNA and molecular cloning techniques used here are well known in the art and are described by Ausubel (ed.), CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley and Sons, Inc. (1994); J. Sambrook, et al., MOLECULAR CLONING: A LABORATORY MANUAL, 3d Ed., Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press (2001); and T. J. Silhavy, M. L. Berman, and L. W. Enquist, EXPERIMENTS WITH GENE FUSIONS, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1984).

Example 1 Construction of a Modified Neomycin Phosphotransferase II (nptII) Gene with Four Arabidopsis thaliana Introns

To introduce four Arabidopsis thaliana introns into the neomycin phosphotransferase II gene (nptII), which confers kanamycin resistance, individual introns and nptII coding sub-regions (artificial exons) were amplified with PCR and then combined by a second round of PCR to form hybrid fragments containing Arabidopsis intron-nptII exon cassettes. Each cassette was cloned individually and combined using standard DNA recombination methods. NptII exon 1 was amplified using primers NPTFA (SEQ ID NO:1: 5′-GAT CTC TAG AAT GAT TGA ACA AGA TGG ATT-3′) and NPTRA (SEQ ID NO:2: 5′-TCG CAG CTT GGT ACC TGC AGT TCA TTC AGG GC-3′) from pCIB200 (Rothstein et al., Gene 53:153-161, 1987). The PCR product was digested with XbaI/PstI and inserted into XbaI/PstI-digested pNOV2799 to form pNOV2711. pNOV2799 was derived from pNOV205 by replacing the SacII/XbaI polylinker with the SpeI/XbaI polylinker from pLITMUS28 (New England Biolabs). pNOV204 is a pBluescript vector containing the Smas promoter (Ni et al. 1996 Plant J. 7: 661-676). The intron in the untranslated leader of AtBAF60 was amplified from A. thaliana ecotype Columbia DNA with primers IntBAFFW (SEQ ID NO:3: 5′-GCC CTG AAT GAA CTG CAG GTA CCA AGC TGC GA-3′) and IntBAFRV (SEQ ID NO:4: 5′-GCC GCG CTG CCT CGT CCT GAA AAA TTC AGA AA-3′). AtBAF60 (CHCl) is a gene that shares homology with the mammalian nucleosome-remodeling factor BAF60 (http://www.chromdb.org/). NptII exon 2 was amplified from pCIB200 using primers NPTF2 (SEQ ID NO:5: 5′-TTT CTG AAT TTT TCA GGA CGA GGC AGC GCG GC-3′) and NPTR2 (SEQ ID NO:6: 5′-GAA TAG TAC TAA TAC CTG GCA CTT CGC CCA ATA G-3′). A PAL1 intron was amplified from Arabidopsis thaliana ecotype Landsberg erecta using primers IntPALFW (SEQ ID NO:7: 5′-TTA GTA CTA TTC TTT TGT TCT CTA ATC AGA-3′) and IntPALRV (SEQ ID NO:8: 5′-TGA CAG GAG ATC CTG CCC TGT AAC GAA CAA AAA CAT-3′).
NptII exon 3 was amplified from pCIB200 using primers NPTFC (SEQ ID NO:9: 5′-ATG TTT TTG TTC GTT ACA GGG CAG GAT CTC CTG TCA-3′) and NPTR3 (SEQ ID NO:10: 5′-ATC GAT TCA TAT ATA TAC CTG GTC GAC AAG ACC GGC-3′). A tubulin-1-β intron (760 bps) was amplified from Arabidopsis thaliana ecotype Columbia with primers IntTUBFW (SEQ ID NO:11: 5′-CAG GTA TAT ATA TGA ATC GAT TTC TCC CTT-3′) and IntTUBRV (SEQ ID NO:12: 5′-TCG TCC AGA TCA TCC TGT AAT ACA GAA ATG TT-3′). NptII exon 4 was amplified from pCIB200 (Rothstein et al. 1987 Gene 53:153-161) with primers NPTFD (SEQ ID NO:13: 5′-AAC ATT TCT GTA TTA CAG GAT GAT CTG GAC GA-3′) and NPTR4 (SEQ ID NO:14: 5′-GGA AAA GCT TAA TTA CCT CGC CGT CGG GCA TG-3′). A tubulin-1-α intron (560 bps) was amplified from Arabidopsis thaliana ecotype Columbia with primers IntTUAFW (SEQ ID NO:15: 5′-GTA ATT AAG CTT TTC CAC CTC TCT TGT T-3′) and IntTUARV (SEQ ID NO:16: 5′-GAT CCT GCA GCA ATG GAA AAA TAT TTC AAT AC-3′). NptII exon 5 was amplified from pCIB200 with primers NPTFE (SEQ ID NO:17: 5′-ATT GCT GCA GGA TCT CGT CGT GAC CCA TGG-3′) and NPTR5 (SEQ ID NO:18: 5′-CAT TAG GAT CCT CAG AAG AAC TCG TCA A-3′). All of the above PCR products were gel purified and used as templates for a second round of PCR amplification. All PCR reactions were carried out with a mixture of Taq polymerase and Pfu polymerase (30 to 1, unit/unit) in a Perkin-Elmer thermocycler 9600.
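
The second round of PCR depends on adjoining first-round products sharing an identical end, which follows when the reverse primer of one product is the reverse complement of the forward primer of the next. The Python sketch below is offered only as an illustration of that relationship for two of the junctions described above; it is not part of the disclosed method, and the function and variable names are hypothetical.

```python
# Illustrative check (not part of the disclosed method): primers that flank an
# exon/intron junction should be reverse complements of one another so that the
# two first-round PCR products can prime each other in the second round.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def clean(seq: str) -> str:
    """Remove the spacing used in the primer listings and upper-case the bases."""
    return "".join(seq.split()).upper()


# Primer sequences copied from SEQ ID NOs: 2, 3, 4 and 5.
NPTRA = clean("TCG CAG CTT GGT ACC TGC AGT TCA TTC AGG GC")
INTBAFFW = clean("GCC CTG AAT GAA CTG CAG GTA CCA AGC TGC GA")
INTBAFRV = clean("GCC GCG CTG CCT CGT CCT GAA AAA TTC AGA AA")
NPTF2 = clean("TTT CTG AAT TTT TCA GGA CGA GGC AGC GCG GC")

# Each entry pairs the reverse primer of one product with the forward primer
# of the neighbouring product.
junctions = {
    "exon 1 / AtBAF60 intron": (NPTRA, INTBAFFW),
    "AtBAF60 intron / exon 2": (INTBAFRV, NPTF2),
}

results = {
    name: reverse_complement(rev) == fwd
    for name, (rev, fwd) in junctions.items()
}

for name, ok in results.items():
    print(f"{name}: primers fully complementary = {ok}")
```

The same comparison could be repeated for the remaining junction primer pairs by adding them to the `junctions` dictionary.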

Purified AtBAF60 intron and nptII exon 2 PCR fragments were used as templates for amplification with primers IntBAFFW and NPTR2, and the resulting PCR product was cloned into a pCR2.1-TOPO vector to form pNOV2708. The PAL1 intron and nptII exon 3 were amplified using primers IntPALFW and NPTR3, and the PCR product was cloned into pCR2.1-TOPO to form pNOV2709. The tubulin-1-β intron and nptII exon 4 PCR fragments were amplified using primers IntTUBFW and NPTR4, and the resulting PCR product was cloned into pCR2.1-TOPO to form pNOV2710. The tubulin-1-α intron and nptII exon 5 fragments were co-amplified using IntTUAFW and NPTR5 primers, and the resulting PCR product was inserted into pCR2.1-TOPO to form pNOV2712. Plasmid pNOV2708 was digested partially with BglII. A linker was formed by annealing two oligonucleotides, FRTBGL2 (SEQ ID NO:19: 5′-GAT CTG AAG TTC CTA TTC TCT AGA AAG TAT AGG AAC TTC G-3′) and FRTBAM1 (SEQ ID NO:20: 5′-GAT CCG AAG TTC CTA TAC TTT CTA GAG AAT AGG AAC TTC A-3′). This linker, which contained an FRT site, was inserted into the BglII site in the AtBAF60 intron to form pNOV2715. The PAL1 gene 3′-UTR was amplified from Arabidopsis thaliana ecotype Landsberg erecta using primers TPALBGLII (SEQ ID NO:21: 5′-TGT TAA GAT CTT AGT CCT CTG TTT TTT TCT-3′) and TPALSACI (SEQ ID NO:22: 5′-CTT GAG CTC TTC TAT AAC CCT AGA TGG CTA-3′). The PAL1 3′-UTR PCR product was digested with BglII and SacI and then inserted into BglII/SacI-digested pLITMUS28 to form pNOV2707. All the inserts in the above clones were sequenced to ensure that no mutations were introduced into the coding sequence.
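
The FRT linker can be inserted into a BglII site because the two annealed oligonucleotides leave 5′-GATC overhangs, which are compatible with BglII-cut ends. The following Python sketch illustrates this by annealing the two oligonucleotides in silico. It is a simplified model that assumes a 5′ overhang on both strands; the `anneal` helper is hypothetical, and the 34-base FRT sequence used for comparison is the commonly cited minimal site, stated here as an assumption rather than taken from the text.

```python
# Illustrative in-silico annealing of the FRT linker oligonucleotides.
# Simplified model: both strands are assumed to carry a 5' overhang.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def clean(seq: str) -> str:
    """Remove listing spaces and upper-case the bases."""
    return "".join(seq.split()).upper()


def anneal(top: str, bottom: str):
    """Return (top 5' overhang, double-stranded core, bottom 5' overhang).

    The core is the longest 3' portion of the top strand that pairs with the
    bottom strand; sequences are all reported 5' to 3'.
    """
    rc_bottom = reverse_complement(bottom)
    for k in range(len(top)):
        core = top[k:]
        if rc_bottom.startswith(core):
            bottom_overhang = bottom[: len(bottom) - len(core)]
            return top[:k], core, bottom_overhang
    raise ValueError("oligonucleotides do not anneal")


# Oligonucleotides copied from SEQ ID NOs: 19 and 20.
FRTBGL2 = clean("GAT CTG AAG TTC CTA TTC TCT AGA AAG TAT AGG AAC TTC G")
FRTBAM1 = clean("GAT CCG AAG TTC CTA TAC TTT CTA GAG AAT AGG AAC TTC A")

# Commonly cited minimal FRT site (assumption, used only for comparison).
FRT_MINIMAL = "GAAGTTCCTATTCTCTAGAAAGTATAGGAACTTC"

top_overhang, core, bottom_overhang = anneal(FRTBGL2, FRTBAM1)

print("top-strand 5' overhang:   ", top_overhang)
print("bottom-strand 5' overhang:", bottom_overhang)
print("duplex length (bp):       ", len(core))
print("contains minimal FRT site:", FRT_MINIMAL in core)
```

Both overhangs read GATC, which is the overhang produced by BglII and by BamHI, so under this model the linker can ligate into a BglII-cut plasmid in either orientation.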

The individual intron-nptII exon cassettes were then recombined to form a full-length modified nptII gene. The AtBAF60 intron-nptII exon 2 fragment was removed from pNOV2715 by partial PstI and complete XhoI digestion and inserted into PstI/XbaI-digested pNOV2711 to form pNOV2718. The tubulin-1-β intron-exon 4 fragment was released from pNOV2710 by ClaI/HindIII double-digestion and inserted into ClaI/HindIII-digested pNOV2709 to form pNOV2716, containing PAL1 intron-nptII exon 3-tubulin-1-β intron-nptII exon 4. A BglII/SacI fragment containing the PAL1 3′-UTR was removed from pQD7A1 and inserted into BamHI/SacI-digested pNOV2712 to form pNOV2717. The 1.5 kb XhoI/HindIII and 1.1 kb HindIII/SacI fragments containing intron-exon cassettes were removed from pNOV2716 and pNOV2717, respectively, and inserted into XhoI/SacI-digested pBluescript II KS(+) to form pNOV2719. Plasmid pNOV2719 was digested with SacI and ScaI, and the 2.6 kb SacI/ScaI fragment containing intron-exons and the PAL1 3′-UTR was isolated and inserted into pNOV2718 partially digested with SacI and ScaI to form pNOV2720. Plasmid pNOV2720 then contained the full-length modified nptII gene with four Arabidopsis introns (FIG. 14) under the control of a modified super MAS (mSmas) promoter.

Example 2 Construction of a Control Vector for Dicot Plants

The 4489 base pair SacI/XhoI fragment containing the mSmas promoter-modified nptII-PAL1 3′ end cassette was removed from pNOV2720 and inserted into pNOV2722, which had been partially digested with SacI and XhoI, to form control construct pNOV2731 (FIG. 16). pNOV2731 was transformed into Agrobacterium LBA4404, and the resulting Agrobacterium strain was used to transform both Arabidopsis and tobacco plants. Phosphinothricin (PPT) resistant transformants produced seeds that were highly resistant to kanamycin. The results demonstrate that the modified nptII gene is fully functional and the introns can be spliced out efficiently.

Example 3 Construction of Target Vectors for Dicot Plants

The coding region for the bar gene, which provides resistance to the herbicide Basta®, was amplified from pGSFR1 (D'Halluin et al. 1992 Methods Enzymol. 216: 415-26) using two primers, BARCLA (SEQ ID NO:23: 5′-TCA TAT CGA TGA GCC CAG AAC GAC GCC-3′) and BARBGL (SEQ ID NO:24: 5′-TTT GAG ATC TTC ATA TCT CGG TGA CGG GCA GG-3′). The gel-purified PCR product was digested with BglII and inserted into SmaI/BamHI-digested pHSPnos to form pNOV2703. pHSPnos is a pSPORT1 base vector (GIBCO BRL, Rockville, Md.) containing the Brassica HSP80 promoter (see U.S. Pat. No. 5,612,472) followed by the nopaline synthase terminator (Bevan et al. 1983 Nucleic Acids Res. 11, 369-385). pNOV2703 was digested with NotI, filled in with a Klenow fragment, and then digested again with XhoI to isolate the 2.4 kb NotI/XhoI fragment containing the Brassica HSP80 promoter-bar-Tnos expression cassette. Binary base vector pHINK078 was digested with ApaI, filled in with a Klenow fragment, and then cut with XhoI. The above-described 2.4 kb NotI/XhoI HSP80 promoter-bar-nos 3′UTR fragment was inserted into ApaI/XhoI-digested pHINK078 to form pNOV2797. pNOV2797 was digested with BglII, filled in with a Klenow fragment, and religated to form pNOV2706. The SacI/NcoI polylinker (88 bps) from pNOV2799 was inserted into SacI/NcoI-digested pNOV2706 to form pNOV2722. pNOV2722 was cut with BglII and then ligated with BglII/BamHI-digested DNA fragments containing a recognition sequence for endonuclease I-SceI, I-CeuI, or HO to form pNOV2723 (I-SceI), pNOV2724 (I-CeuI), and pNOV2725 (HO), respectively. The DNA fragment containing the I-SceI site was synthesized by annealing oligonucleotides ISCEBAM1 (SEQ ID NO:25: 5′-ACT TGG ATC CAT ATT ACC CTG TTA TCC CTA-3′) and ISCEBGL2 (SEQ ID NO:26: 5′-TCG AAG ATC TGC TAG GGA TAA CAG GGT AAT-3′), filled in with a Klenow fragment of E. coli DNA polymerase I, and then digested with BglII and BamHI.
A DNA fragment for I-CeuI was synthesized similarly with oligonucleotides ICEUBGL2 (SEQ ID NO:27: 5′-TCG AAG ATC TCT ATA ACG GTC GTA AGG TAG-3′) and ICEUBAM1 (SEQ ID NO:28: 5′-ACT TGG ATC CTC GCT ACC TTA GGA CCG TTA-3′). The DNA fragment for the HO site was synthesized with oligonucleotides HOBGL2 (SEQ ID NO:29: 5′-TCG AAG ATC TAG CTT TCC GCA ACA GTA TAA-3′) and HOBAM1 (SEQ ID NO:30: 5′-ACT TGG ATC CAT TAT ACT GTT GCG GAA AGC-3′). pNOV2720 was digested with BglII and SacI to isolate a 3054 bp BglII/SacI fragment containing truncated modified nptII-PAL1-3′-UTR. This fragment was inserted into BglII/SacI-digested pNOV2723, pNOV2724, and pNOV2725 to form pNOV2700 (with I-SceI site), pNOV2729 (with I-CeuI site), and pNOV2701 (with HO site), respectively (FIG. 13C).
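
The anneal-and-fill-in step used for the endonuclease site fragments can be sketched in a few lines of Python. The example below uses the I-SceI oligonucleotides ISCEBAM1 and ISCEBGL2 quoted above: it finds their 3′ overlap and extends each strand across the other, as a Klenow fill-in would. It is a simplified illustration under the assumption that the two oligonucleotides pair only through their 3′ ends; the `fill_in` helper and its `min_overlap` parameter are hypothetical.

```python
# Illustrative model of annealing two oligonucleotides through their 3' ends
# and filling in the single-stranded regions (as with Klenow fragment).

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def clean(seq: str) -> str:
    """Remove listing spaces and upper-case the bases."""
    return "".join(seq.split()).upper()


def fill_in(oligo_a: str, oligo_b: str, min_overlap: int = 8):
    """Return (top strand of the filled-in duplex, overlap length).

    The 3' end of oligo_a is paired with the 3' end of oligo_b; the longest
    overlap of at least min_overlap bases is used.
    """
    rc_b = reverse_complement(oligo_b)
    for k in range(min(len(oligo_a), len(rc_b)), min_overlap - 1, -1):
        if oligo_a[-k:] == rc_b[:k]:
            return oligo_a + rc_b[k:], k
    raise ValueError("no 3' overlap of the required length")


# Oligonucleotides copied from SEQ ID NOs: 25 and 26.
ISCEBAM1 = clean("ACT TGG ATC CAT ATT ACC CTG TTA TCC CTA")
ISCEBGL2 = clean("TCG AAG ATC TGC TAG GGA TAA CAG GGT AAT")

product, overlap = fill_in(ISCEBAM1, ISCEBGL2)
shared_site = reverse_complement(ISCEBAM1[-overlap:])  # as written in ISCEBGL2

print("overlap (bases):     ", overlap)
print("filled-in top strand:", product)
print("shared site:         ", shared_site)
print("BamHI site present:  ", "GGATCC" in product)
print("BglII site present:  ", "AGATCT" in product)
```

Under these assumptions the filled-in duplex carries a BamHI site at one end and a BglII site at the other, flanking the sequence shared by the two oligonucleotides, which is consistent with the subsequent BglII/BamHI digestion.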

Example 4 Generation of Tobacco Plants Containing a Target Construct

Agrobacterium tumefaciens strain LBA4404 containing target constructs pNOV2700, pNOV2701, pNOV2729, or control plasmid pNOV2731, respectively, was used to infect leaf explants of Nicotiana tabacum c.v. Petit Havana (SR1). Transgenic plants were obtained from the infected leaf explants using PPT (5 mg/L) as a selection agent. Initially, the tobacco leaves were cut into 1-2 mm wide slices, exposed to the Agrobacterium resuspended in MS3S for 5 minutes, and then moved to sterile paper to blot away excess liquid and placed on co-cultivation medium (MS3S+NAA (0.1 mg/L)+6-BA (1 mg/L)+gelrite agar (2.4 g/L)) for 3 days. The leaf slices were then moved to selection/regeneration medium (MS3S+NAA (0.1 mg/L)+6-BA (1 mg/L)+Carbenicillin (200 mg/L)+gelrite agar (2.4 g/L)+PPT (5 mg/L)). PPT resistant shoots were rooted in selection/rooting medium (MSB+PPT (5 mg/L)+Carbenicillin (200 mg/L)+phytagar (8 g/L) in GA-7 boxes) and then transplanted to soil. As a control, shoots transformed with pNOV2731 were placed in rooting medium that included kanamycin (150 mg/L) as well as PPT (5 mg/L) to screen for the expression of the modified nptII gene with four introns. The plants were selfed or outcrossed with pollen from nontransgenic SR1 plants to produce seeds.

Example 5 Molecular Analysis of Transgenic Plants

The DNA of the transgenic plants was analyzed in accordance with standard molecular biological techniques. DNA was isolated from the leaves of transgenic plants for T-DNA structure analysis using the CTAB protocol (Jorgensen et al. 1996 Plant Mol. Biol. 31:957-973). Samples containing about 5 micrograms of tobacco DNA were digested with a restriction enzyme, such as SacI, NheI, SpeI, KpnI, ScaI, HpaI, EcoRI, or EcoRV, separated on an agarose gel, blotted onto Hybond N+ nylon membrane, and then hybridized with a ³²P-labeled probe. The probes were prepared from either a PPT fragment or the nptII exon 5/Pal terminator fragment, as appropriate.

Example 6 Construction of Donor Vectors

pNOV2704 was digested with NotI, blunted with Klenow, cut with XbaI, and ligated with the 3.1 kb KpnI/XbaI (blunted) fragment of pNOV2705 containing the UBQ3 5′ region (promoter, intron, and leader)-Hyg-TUBG3 to create pNOV2726. pLITMUS28 (New England Biolabs, MA) was digested with BglII and ligated with a fragment containing an FRT site derived from annealed oligonucleotides FRTBGL2 (SEQ ID NO:31: 5′-GAT CTG AAG TTC CTA TTC TCT AGA AAG TAT AGG AAC TTC G-3′) and FRTBAM1 (SEQ ID NO:32: 5′-GAT CCG AAG TTC CTA TAC TTT CTA GAG AAT AGG AAC TTC A-3′) to create plasmid pNOV2727. pNOV2727 was digested with XhoI, filled in with a Klenow fragment in the presence of dTTP only, then cut with SacI to isolate a 2.8 kb XhoI/SacI fragment. pNOV2720 was cut with ClaI, filled in with a Klenow fragment, then digested with SacI to isolate the 4.8 kb SacI/ClaI fragment. The 2.8 kb XhoI/SacI fragment of pNOV2727 was ligated with the 4.8 kb ClaI/SacI fragment of pNOV2720 to create pNOV2732.

pNOV2700 was digested partially with EcoRV and XhoI to isolate the 10 kb EcoRV/XhoI fragment and then ligated with the 3.2 kb SacI (blunted)/XhoI fragment of pNOV2726 to create pNOV2733. pNOV2733 was digested with SacI, blunted with T4 DNA polymerase, and then partially cut with BglII to isolate the 10.4 kb BglII/SacI (blunted) fragment. pNOV2732 was digested with NcoI, filled in with a Klenow fragment, and then partially cut with BglII to isolate the 3.97 kb BglII/NcoI (blunted) fragment. The donor construct pNOV2736 (FIG. 13D) was created by ligating the 10.4 kb BglII/SacI (blunted) fragment with the 3.97 kb BglII/NcoI (blunted) fragment.

pNOV2729 was digested partially with EcoRV and XhoI to isolate the 10 kb EcoRV/XhoI fragment. The fragment was ligated with a 3.2 kb SacI/XhoI fragment of pNOV2726 (the SacI site was blunted by a Klenow treatment) to create pNOV2734.

pNOV2734 was digested with SacI, blunted with T4 DNA polymerase treatment, then partially cut with BglII to isolate the 10.4 kb SacI/BglII fragment. This fragment was ligated with the 4 kb NcoI/BglII fragment (the NcoI site was blunted by a Klenow fragment) of pNOV2732 to create donor construct pNOV2737 (FIG. 13D).

pNOV2701 was digested partially with EcoRV and XhoI to isolate the 10 kb EcoRV/XhoI fragment. This fragment was ligated with the 3.2 kb SacI/XhoI fragment of pNOV2726 (the SacI site was blunted by Klenow treatment) to create pNOV2735.

pNOV2735 was digested with SacI, blunted with T4 DNA polymerase, then partially cut with BglII to isolate the 10.4 kb SacI/BglII fragment. This fragment was ligated with the 4 kb NcoI/BglII fragment (the NcoI site was blunted) of pNOV2732 to create donor construct pNOV2738.

pNOV2734 was digested partially with Ecl136II and BglII to isolate a 10.4 kb Ecl136II/BglII fragment. This fragment was ligated to a 2.5 kb SalI (blunted)/BglII fragment of pNOV2732 to form donor construct pNOV2755 (FIG. 13D). The pNOV2734 Ecl136II/BglII (10.4 kb) fragment was ligated with a 1.54 kb MscI/BglII fragment of pNOV2732 to form pNOV2756. The pNOV2734 Ecl136II/BglII (10.4 kb) fragment was ligated with a 1.42 kb EcoRI (blunted)/BglII fragment of pNOV2732 to form pNOV2757 (FIG. 13D).

pNOV2733 was digested partially with Ecl136II and BglII to isolate a 10.4 kb Ecl136II/BglII fragment. The fragment was ligated with the 3.97 kb NcoI (blunted with Klenow)/BglII fragment of pNOV2732 to form binary donor pNOV2759.

Example 7 Construction of an HO Endonuclease Expression Vector for Dicot Plants

The coding region of the yeast HO endonuclease gene was amplified from Saccharomyces cerevisiae (ATCC48893) using primers HOATG (SEQ ID NO:33: 5′-CTA CTG TCG ACA AAA ATG CTT TCT GAA AAC-3′) and HOBAMH (SEQ ID NO:34: 5′-CTA GGA TCC GAC CTG GTC GTC ACA GTA GCT-3′), and the PCR product was cloned into the pCR2.1-TOPO vector to form pNOV2741. pNOV2741 was digested partially with SalI and BamHI, and the SalI/BamHI fragment containing the HO gene was inserted into (SalI)partial/BamHI-digested pNOV2721 to form pNOV2742. The Act2 promoter-HO-act2 terminator cassette was excised from pNOV2742 by KpnI and SacI digestion and was inserted into KpnI/SacI-digested pHINK078 to form binary vector pNOV2747 (FIG. 13E). The HO expression cassette was also excised from pNOV2742 by KpnI and SacI digestion and inserted into KpnI/SacI-cut pCIB100 (Rothstein et al. 1987 Gene 53:153-161) to form pNOV036.

Example 8 Construction of a Synthetic I-CeuI Gene with Maize-Preferred Codons

The amino acid sequence for the homing endonuclease I-CeuI (Gauthier, Turmel, and Lemieux 1991 Curr. Genet. 19: 43-47) was back-translated into the DNA sequence shown in SEQ ID NO:35 using maize-preferred codons (see U.S. Pat. No. 6,121,014). The unique restriction endonuclease cut site EagI was identified within this DNA sequence, which allowed the DNA to be cloned as two separate segments or sub-fragments of 340 bp and 346 bp. Because expression of the I-CeuI endonuclease is toxic to E. coli, an intron was introduced into the 5′-segment before excision and ligation of the segments to form the complete gene. A 189-bp potato ST-LS1 intron sequence (Narasimhulu et al. 1996 Plant Cell 8:873-886) was also inserted into I-CeuI to facilitate cloning in E. coli. Each of the two sub-fragments was constructed from oligonucleotides ranging from 65 to 75 bases in length, with each oligonucleotide overlapping neighboring oligonucleotides by 20 bp.

Segment 1 of synthetic I-CeuI (SynICeuI) included the first 335 bp preceding the EagI site and was constructed from the following oligonucleotides: 1A (SEQ ID NO:36: 5′-GGGGA TCCAT GAGCA ACTTC ATCCT GAAGC CCGGC GAGAA GCTGC CCCGG ACAAG CTGGA GGAGC TGAAG AAGA-3′) (GG + BamHI site + top strand bases 1-67), 1B (SEQ ID NO:37: 5′-CGCAG GTCGA TCAGG TACTT GCTGA AGTTC TTGGT CTTCT TCACG GCGTC GTTGA TCTTC TTCAG CTCCT CCAGC-3′) (bottom strand bases 48-122), 1C (SEQ ID NO:38: 5′-AAGTA CCTGA TCGAC CTGCG CAAGC TGTTC CAGAT CGACG AGGTG CAGGT GACCA GCGAG AGCAA GCTGT TCCTG-3′) (top strand bases 103-177), 1D (SEQ ID NO:39: 5′-TGG CCA GCT TCT TGG TGC TGA TGT TCA GGC TGG CCT CGC CCT CCA GGA AGC CGG CCA GGA ACA GCT TGC TCT CGC-3′) (bottom strand bases 158-232), 1E (SEQ ID NO:40: 5′-CAGCA CCAAG AAGCT GGCCA CCAGC AAGTT CGGCC TGGTG GTGGA CCCCG AGTTC AACGT GACCC AGCAC GTGAA-3′) (top strand bases 213-287), and 1F (SEQ ID NO:41: 5′-CGCAG GTCGA TCAGG TACTT GCTGA AGTTC TTGGT CTTCT TCACG GCGTCGTTGAT CTTCT TCAGC TCCTC CAGC-3′) (bottom strand bases 268-335 + 5′ CCC).
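The tiling of segment 1 can be checked with a short script. The following Python sketch takes the strand coordinates stated above for oligonucleotides 1A to 1F and confirms that each neighboring pair overlaps by 20 bases; it then checks that the 3′ end of 1A pairs with the 3′ end of 1B, using the terminal bases of SEQ ID NO:36 and SEQ ID NO:37 as printed. The function and variable names are hypothetical and serve only as an illustration.

```python
# Illustrative check of the oligonucleotide tiling for SynICeuI segment 1.
# Coordinates and terminal bases are taken from the text; all helper names
# are hypothetical and not part of the disclosed method.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def neighbor_overlaps(ranges):
    """Overlap (in bases) between consecutive inclusive (start, end) ranges."""
    return [
        prev_end - next_start + 1
        for (_, prev_end), (next_start, _) in zip(ranges, ranges[1:])
    ]


def annealing_overlap(top: str, bottom: str) -> int:
    """Length of the longest 3' end of `top` that base-pairs with the 3' end of `bottom`."""
    bottom_as_top = reverse_complement(bottom)
    for k in range(min(len(top), len(bottom_as_top)), 0, -1):
        if top[-k:] == bottom_as_top[:k]:
            return k
    return 0


# Base ranges stated for oligonucleotides 1A-1F (alternating top/bottom strand).
SEGMENT1_RANGES = [(1, 67), (48, 122), (103, 177), (158, 232), (213, 287), (268, 335)]

# 3'-terminal bases of 1A (last 24) and 1B (last 25), as printed.
OLIGO_1A_3PRIME = "ACAAGCTGGAGGAGCTGAAGAAGA"
OLIGO_1B_3PRIME = "GTTGATCTTCTTCAGCTCCTCCAGC"

if __name__ == "__main__":
    print("coordinate overlaps:", neighbor_overlaps(SEGMENT1_RANGES))
    print("1A/1B annealing overlap:", annealing_overlap(OLIGO_1A_3PRIME, OLIGO_1B_3PRIME))
```

The same two helpers could be applied to the remaining pairs (CD and EF) before the Klenow fill-in step is planned.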

Segment 1 was constructed in three steps: (1) a Klenow fill-in reaction to form three sets of dimers (AB, CD, and EF); (2) a PCR joining of dimers CD and EF to form a tetramer CDEF; and (3) a second PCR joining of tetramer CDEF to dimer AB, forming hexamer ABCDEF. Three reactions of 50 μl containing 1× DNA polymerase salts and 1 μl each of 20 μM solutions of 1A and 1B, 1C and 1D, and 1E and 1F, respectively, were heated at 67° C. for 5 minutes and then allowed to cool slowly to 22° C. To each reaction was added 1 μl of a mix of four deoxynucleotide triphosphates (10 mM each), plus 2 μl (10 units) of the Klenow fragment of DNA polymerase (New England Biolabs). The reactions were incubated at 22° C. for 15 minutes, producing the AB, CD, and EF precursors of SynICeuI segment 1. Segment CD was joined to overlapping EF by 10 cycles of PCR. A PCR reaction mixture containing 13 μl water, 5 μl each of the CD and EF Klenow reactions, and 1 μl each of the 20 μM solutions of oligo 1C and 1F as primers was added to a Ready-to-Go PCR bead (Amersham Pharmacia Biotech Inc). The PCR reaction conditions were: 95° C. for 5 minutes; (95° C. for 1 min., 56° C. for 30 sec., 72° C. for 1 min.) 10 cycles; 72° C. for 10 min. The yield of tetrameric product was increased by reamplification of the product of this reaction as follows: a new PCR reaction mixture containing 18 μl water, 5 μl of product of the previous PCR reaction, and 1 μl each of the 20 μM solutions of oligo 1C and 1F as primers was added to a Ready-to-Go PCR bead, and the amplification program described above was re-employed. The tetrameric PCR product was excised from an agarose minigel (2% Seaplaque agarose), and the DNA was purified with the QIAquick Gel Extraction Kit (Qiagen, Valencia, CA 91355).
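The reaction set-ups described above can be summarized arithmetically. The Python sketch below totals the stated component volumes of the joining PCR and the reamplification, derives the final primer concentration, and adds up the programmed hold times of the cycling profile. It assumes that the listed components make up the whole reaction volume and ignores ramp times; the names are hypothetical.

```python
# Bookkeeping sketch for the PCR set-ups described in the text.
# Assumption: the listed components make up the entire reaction volume.

def final_concentration_um(stock_um: float, stock_ul: float, total_ul: float) -> float:
    """Final concentration (uM) after diluting `stock_ul` of stock into `total_ul`."""
    return stock_um * stock_ul / total_ul


# Component volumes in microliters, as stated for each reaction.
JOINING_PCR = {
    "water": 13.0,
    "CD Klenow reaction": 5.0,
    "EF Klenow reaction": 5.0,
    "oligo 1C (20 uM)": 1.0,
    "oligo 1F (20 uM)": 1.0,
}
REAMPLIFICATION = {
    "water": 18.0,
    "previous PCR product": 5.0,
    "oligo 1C (20 uM)": 1.0,
    "oligo 1F (20 uM)": 1.0,
}


def programmed_minutes(initial: float, cycle_steps, cycles: int, final: float) -> float:
    """Sum of programmed hold times in minutes (ramp times are not included)."""
    return initial + cycles * sum(cycle_steps) + final


if __name__ == "__main__":
    for name, mix in (("joining PCR", JOINING_PCR), ("reamplification", REAMPLIFICATION)):
        total = sum(mix.values())
        primer = final_concentration_um(20.0, 1.0, total)
        print(f"{name}: {total:.0f} ul total, {primer:.2f} uM each primer")
    # 95 C 5 min; 10 x (1 min, 0.5 min, 1 min); 72 C 10 min
    print("programmed time (min):", programmed_minutes(5.0, (1.0, 0.5, 1.0), 10, 10.0))
```

Both mixtures come to the same total volume, so the primer concentration is unchanged between the first amplification and the reamplification.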

In order to form the hexameric product, the PCR-mediated joining process was repeated using tetramer CDEF plus dimer AB with oligonucleotides 1A and 1F as primers. The resulting hexameric DNA fragment was isolated and purified as described above and then cloned using the TOPO-TA Cloning Kit (Invitrogen, Carlsbad, Calif.). Clones with hexamer-sized inserts were sequenced to identify one of perfect sequence, which is referred to as pCR2.1SynICeuI-1. For assembly of the complete synthetic gene, the fragment was ultimately excised from the TOPO vector with BamHI and EagI, but only after introduction of an intron (see below).

Segment 2 was constructed from the following oligonucleotides: 2G (SEQ ID NO:42: 5′-CCC CGG CCG CAT CCG CCA CAA GAG CGG CAG CAA CGC CAC CCT GGT GCT GAC CAT CGA CAA CCG CCA GAG CCT GGA-3′), 2H (SEQ ID NO:43: 5′-CTC GGG GCT GCT GAA GGC CAC CAC GTA CTG CTC GTA GAA GGG GAT CAC CTT CTC CTC CAG GCT CTG GCG GTT GTC-3′), 2I (SEQ ID NO:44: 5′-TGG CCT TCA GCA GCC CCG AGA AGG TGA AGC GCG TGG CCA ACT TCA AGG CCC TGC TGG AGC TGT TCA ACA ACG ACG-3′), 2J (SEQ ID NO:45: 5′-ATC TGG TCC CAG ATG GGC AGG ATC TTG TTC ACC AGC TGC TCC AGG TCC TGG TGG GCG TCG TTG TTG AAC AGC TCC-3′), 2K (SEQ ID NO:46: 5′-CTG CCC ATC TGG GAC CAG ATG CGC AAG CAG CAG GGC CAG AGC AAC GAG GGC TTC CCC AAC CTG GAG GCC GCC CAG-3′), and 2L (SEQ ID NO:47: 5′-GGG GAA TTC CTA CTT GAT GCC CTT CTT GTA GTT GCG GGC GAA GTC CTG GGC GGC CTC CAG GTT GG-3′). In a manner similar to that described above for segment 1, segment 2 was constructed in three steps: (1) a Klenow fill-in reaction to form three sets of dimers (GH, IJ, and KL); (2) a PCR joining of two overlapping dimers to form a tetramer; and (3) a second PCR joining of the tetramer with the remaining dimer to form the hexamer, GHIJKL.

The hexamer DNA fragment GHIJKL was cloned into pCR2.1 using the TOPO-TA Cloning Kit (Invitrogen) and was sequenced to identify a clone of perfect sequence, which is referred to as pCR2.1ICeuI-2.

Introduction of an Intron into pCR2.1SynICeuI-1

The potato ST-LS1 intron was PCR-amplified from pBISN1 (Narasimhulu et al. 1996 Plant Cell 8:873-886) using an oligonucleotide primer pair (i.e., SEQ ID NO:48: 5′-GGGTA CGTAA GTTTC TGCTT CTACC TTTG-3′ and SEQ ID NO:49: 5′-CCCCAG CTGCA CATCA ACAAA TTTTG GTC-3′) to form SnaBI (TACGTA) and PvuII (CAGCTG) sites at the 5′ and 3′ ends of the intron, respectively. The PCR product was cloned using the TOPO-TA Cloning Kit (Invitrogen), and a perfect copy, referred to as pInt1, was identified through sequencing. The intron was excised from pInt1 as a SnaBI/PvuII fragment, gel-purified, and then extracted from agarose with the QIAquick Gel Extraction Kit. pCR2.1SynICeuI-1 was cleaved at a unique PmlI site in the insert and, in accordance with methods known in the art, was treated with alkaline phosphatase under appropriately stringent conditions for achieving blunt-end dephosphorylation. The intron fragment was ligated into this vector, and candidate clones were screened by ApoI digestion and sequenced to confirm a clone of perfect sequence with the intron in the correct orientation with respect to the coding sequence of ICeuI. The plasmid so identified is referred to as pCRSynICeuI-1-int.

Assembly of the SynICeuI Gene

Plasmid pBluescript KS(+) (Stratagene, Inc.) was digested with NotI and EcoRI in the presence of alkaline phosphatase. The 3′ end of SynICeuI was excised from pCR2.1ICeuI-2 with EagI and EcoRI, gel-purified, and ligated to the pBluescript vector, forming pBS-GHIJKL. Because the EagI site of the insert is a half NotI site, the NotI site was reconstituted in the product. This plasmid was next cleaved with NotI in the presence of alkaline phosphatase, and the 5′ end of SynICeuI, including the intron, excised as an EagI fragment from pCRSynICeuI-1-Int, was ligated into place. Candidate clones were sequenced to identify one with the ABCDEIntF fragment inserted in the correct orientation. The identified clone is referred to as pBS-ICeuI-Int. The sequence of SynICeuI is represented by SEQ ID NO:35, which shows the flanking noncoding DNA between the EcoRI sites in italics.

SEQ ID NO: 35. I-CeuI endonuclease with maize-preferred codons and potato ST-LS1 intron

GAATTCGCCCTTGGGGATCCATGAGCAACTTCATCCTGAAGCCCGGCGAGAAGCTGCCCCAGGACAAGCTGGAGGAGCTGAAGAAGATCAACGACGCCGTGAAGAAGACCAAGAACTTCAGCAAGTACCTGATCGACCTGCGCAAGCTGTTCCAGATCGACGAGGTGCAGGTGACCAGCGAGAGCAAGCTGTTCCTGGCCGGCTTCCTGGAGGGCGAGGCCAGCCTGAACATCAGCACCAAGAAGCTGGCCACCAGCAAGTTCGGCCTGGTGGTGGACCCCGAGTTCAACGTGACCCAGCACGTAAGTTTCTGCTTCTACCTTTGATATATATATAATAATTATCATTAATTAGTAGTAATATAATATTTCAAATATTTTTTTCAAAATAAAAGAATGTAGTATATAGCAATTGCTTTTCTGTAGTTTATAAGTGTGTATATTTTAATTTATAACTTTTCTAATATATGACCAAAATTTGTTGATGTGCAGGTGAACGGCGTGAAGGTGCTGTACCTGGCCCTGGAGGTGTTCAAGACCGGCCGCATCCGCCACAAGAGCGGCAGCAACGCCACCCTGGTGCTGACCATCGACAACCGCCAGAGCCTGGAGGAGAAGGTGATCCCCTTCTACGAGCAGTACGTGGTGGCCTTCAGCAGCCCCGAGAAGGTGAAGCGCGTGGCCAACTTCAAGGCCCTGCTGGAGCTGTTCAACAACGACGCCCACCAGGACCTGGAGCAGCTGGTGAACAAGATCCTGCCCATCTGGGACCAGATGCGCAAGCAGCAGGGCCAGAGCAACGAGGGCTTCCCCAACCTGGAGGCCGCCCAGGACTTCGCCCGCAACTACAAGAAGGGCATCAAGTAG GAATTC

Example 9 Construction of a Dicot I-CeuI Endonuclease Expression Vector

The pBH37 plasmid, an expression vector containing a modified Smas promoter, a Nos terminator, and cloning sites between these two regions, was digested with BglII, and the BglII site was converted to an MfeI site by the introduction of the following site conversion oligonucleotide: (SEQ ID NO:50: 5′-GATCGGCAATTGCC-3′). The resulting plasmid, pBH37M, was digested with MfeI in the presence of alkaline phosphatase. SynICeuI was excised from its pBluescript vector as an EcoRI fragment and was ligated into MfeI-cleaved pBH37M. Candidate clones were digested with BstEII/PstI, and a clone having a correctly oriented fragment containing SynICeuI appropriately flanked by the Smas promoter and the Nos terminator was chosen for further cloning into a binary vector. This fragment, referred to as Smas-ICeuI-Int, was excised as a HindIII/EcoRI fragment and ligated into pHINK078 that had been digested with HindIII/EcoRI in the presence of alkaline phosphatase, forming pNOV039. Binary vector pNOV100 was likewise digested with HindIII/EcoRI in the presence of alkaline phosphatase, and the purified HindIII/EcoRI fragment of Smas-ICeuI-Int was ligated with it to form pNOV040.

Example 10 Targeted Integration into a Predetermined Target Locus by Homologous Recombination

Single copy T-DNA transgenic tobacco target lines (T2701.6 and T2701.27) were selected and infected with Agrobacterium tumefaciens strain LBA4404, which contained a donor vector. Seeds derived from target lines T2701.6 and T2701.27 that had been selfed or backcrossed with untransformed SR1 pollen were germinated on MS3S medium with 5 mg/L PPT. Two different methods for generating targeted events were used. In one method, PPT-resistant seedlings were grown in MS3S medium for 3-4 weeks. Leaves of 3 to 6 week old seedlings were used for targeting experiments. The leaves were cut into 1-mm wide slices, exposed for 5 minutes to Agrobacterium resuspended in MS3S, moved to sterile paper to blot away excess liquid, and then placed on co-cultivation medium (MS3S + NAA (0.1 mg/L) + 6-BA (1 mg/L) + gelrite agar (2.4 g/L) in standard Petri dishes) for 3 days. The leaf slices were then moved to selection/regeneration medium (MS3S + NAA (0.1 mg/L) + 6-BA (1 mg/L) + Carbenicillin (200 mg/L) + gelrite agar (2.4 g/L) with kanamycin (200 mg/L)). Kanamycin-resistant shoots were rooted in selection/rooting medium (MSB + PPO (100 nM) + Carbenicillin (200 mg/L) + phytagar (8 g/L) in GA-7 boxes) and then transplanted to soil. In the other method, PPT-resistant 9- to 14-day-old seedlings were used for Agrobacterium-mediated transformation by vacuum infiltration according to the method described in Puchta et al. 1996 Proc. Natl. Acad. Sci. USA 93:5055-5060. Kanamycin-resistant shoots were further verified by PCR analysis.

Table 1 shows the efficiency of targeted integration in three target lines. Co-delivery of an HO expression vector (pNOV2747 or pNOV036) or an I-CeuI expression vector (pNOV039 or pNOV040) did not increase targeting efficiency. Overall, a targeted integration efficiency of up to 1-2% was obtained. It is believed that the insertion of the four Arabidopsis introns in the nptII gene, which extended the region of homology between the target and donor DNA, contributed to the observed targeting efficiency. The enhancing effect on targeting of a longer region of homology is further substantiated by comparing the effect of three different donor vectors (pNOV2736, pNOV2755, pNOV2757) on the targeting efficiency in both line T2701.6 and T2701.27 (Table 1). On average, 1 to 3 targeted events were obtained with donor pNOV2736, which flanks both sides of the Hyg cassette with 2.4 kb of sequence homology with the target, but no event was obtained with pNOV2757, which has 2.4 kb of sequence homology with the target on one side of the marker and no homology to the target on the other side.

TABLE 1 Targeting efficiency of two single-copy lines with different vectors

Experiment    Vector(s)             Homology        Explants   Events   PCR+   South+

Target line T2701.6
HR-01AB       pNOV2737              2.4 & 2.4 kb    237        3        2      2
HR-01AC       pNOV2737, 2747        2.4 & 2.4 kb    277        1        1      1
HR-01AD       pNOV2755, 2747        2.4 & 1.2 kb    233        4        2      2
HR-02AA       pNOV2736, 036         2.4 & 2.4 kb    347        0        0      0
HR-02AC       pNOV2755, 036         2.4 & 1.2 kb    303        1        1      1
HR-03AB       pNOV2736              2.4 & 2.4 kb    119        2        2      2
HR-03AD       pNOV2737              2.4 & 2.4 kb    91         2        1      1
HR-05AA       pNOV2736              2.4 & 2.4 kb    247        5        5      3/5*
HR-05AC       pNOV2755              2.4 & 1.2 kb    194        2        1      1
HR-05AD       pNOV2757              2.4 & 0 kb      204        3        0      0
HR-06AA       pNOV2736              2.4 & 2.4 kb    183        2        1      1
HR-06AB       pNOV2737              2.4 & 2.4 kb    179        1        1      1
HR-11AA#      pNOV2736              2.4 & 2.4 kb    100#       3        3      ND

Target line T2701.27
HR-01CA       pNOV2737              2.4 & 2.4 kb    169        5        1      1
HR-01CC       pNOV2737, 2747        2.4 & 2.4 kb    268        4        3      3
HR-02CA       pNOV2736, 036         2.4 & 2.4 kb    259        1        0      0
HR-02CC       pNOV2755, 036         2.4 & 1.2 kb    211        1        1      1
HR-05CA       pNOV2736              2.4 & 2.4 kb    227        8        6      3/3*
HR-05CC       pNOV2755              2.4 & 1.2 kb    183        8        7      2/2*
HR-05CD       pNOV2757              2.4 & 0 kb      175        2        0      0
HR-06CB       pNOV2737              2.4 & 2.4 kb    193        1        1      ND
HR-11CA#      pNOV2736              2.4 & 2.4 kb    100#       4        1      ND

Target line T2729.26
HR-09CC       pNOV2736              2.4 & 2.4 kb    141        1        1      ND
HR-09CE       pNOV2736 + pNOV040    2.4 & 2.4 kb    183        0        0      ND
HR-09CF       pNOV2736, pNOV039     2.4 & 2.4 kb    185        1        1      ND
HR-12CA       pNOV2736              2.4 & 2.4 kb    100        1        1      ND
HR-12CB       pNOV2736 + pNOV040    2.4 & 2.4 kb    100        2        2      ND
HR-12CC       pNOV2736, pNOV039     2.4 & 2.4 kb    100        1        1      ND

ND: not determined.
*Number of events analyzed.
#13-day-old young seedlings instead of leaf explant tissues were used for transformation.
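As an illustration of how a targeting frequency can be derived from Table 1, the Python sketch below pools PCR-positive events and explants over selected experiments. It assumes that efficiency is expressed as PCR-positive events per explant, which is only one possible definition; the text does not state the denominator used. The rows shown are the T2701.6 leaf-explant experiments with donor pNOV2736 alone and the single pNOV2757 experiment in that line. The names are hypothetical.

```python
# Sketch: pooled targeting frequency from selected Table 1 rows.
# Assumption: frequency = PCR-positive events / explants, in percent.
from typing import Iterable, NamedTuple


class Experiment(NamedTuple):
    name: str
    explants: int
    pcr_positive: int


def targeting_frequency_percent(experiments: Iterable[Experiment]) -> float:
    """Pooled PCR-positive events per explant, expressed as a percentage."""
    rows = list(experiments)
    total_explants = sum(row.explants for row in rows)
    if total_explants == 0:
        raise ValueError("no explants to pool")
    return 100.0 * sum(row.pcr_positive for row in rows) / total_explants


# Target line T2701.6, leaf explants, donor pNOV2736 alone (2.4 & 2.4 kb homology).
PNOV2736_T2701_6 = [
    Experiment("HR-03AB", 119, 2),
    Experiment("HR-05AA", 247, 5),
    Experiment("HR-06AA", 183, 1),
]
# Target line T2701.6, donor pNOV2757 (2.4 & 0 kb homology).
PNOV2757_T2701_6 = [Experiment("HR-05AD", 204, 0)]

if __name__ == "__main__":
    print(f"pNOV2736: {targeting_frequency_percent(PNOV2736_T2701_6):.2f}%")
    print(f"pNOV2757: {targeting_frequency_percent(PNOV2757_T2701_6):.2f}%")
```

Under this definition the pooled pNOV2736 value falls within the 1-2% range reported in the text, while pNOV2757 yields no PCR-positive events.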

Example 11 Identification of Recombinant Target Lines

Leaf tissue was collected from these potential recombinants for DNA isolation and PCR analysis (FIG. 14A). In order to identify recombinants, amplification was carried out using the Boehringer Mannheim Expand™ High Fidelity PCR system with primers PSMASFW2 (SEQ ID NO:51: 5′-CCG GTG AGT AAT ATT GTA CGG CTA AGA-3′) and NPTR6 (SEQ ID NO:52: 5′-AGA TCC TCA GAA GAA CTC GTC AAG AAG-3′). Amplification of recombinant junctions was carried out using the Boehringer Mannheim Expand™ Long Template PCR system with primers PHSPFWD (SEQ ID NO:53: 5′-AAT ATA GGC GGT ATT CCG GCC ATT ATA ACA-3′) and TPalExonV (SEQ ID NO:54: 5′-CTA AGA TCC TCA GAA GAA CTC GTC AAG AAG-3′). FIG. 14B illustrates the identification of targeted events by PCR amplification. FIG. 14C shows recombinants that have successfully integrated a second gene cassette.

FIGS. 15A, 15B, and 15C illustrate Southern blot analyses of targeted events achieved through homologous recombination. Genomic DNA digested with several enzymes (EcoRV, SacI, NheI, SpeI) was hybridized with two different probes (i.e., the HSP80 promoter and the nptII exonV-PAL1 terminator). The positions of the probes are indicated in FIG. 15A. The HSP80 promoter probe provided information relating to the copy number of the donor sequence (FIG. 15B) and whether recombination occurred at the left end of the target locus (i.e., the end which included the bar gene cassette). Probing with the nptII exonV-PAL1 terminator provided information regarding target sequence copy number and whether there was any rearrangement at the right end of the target locus (i.e., the end which included the nptII gene cassette) (FIG. 15C). If a recombination event derives from recombination at both ends (double crossover recombination), then both probes would be expected to show a shift in the target bands. If a recombination event derives from recombination at a single side (a single crossover recombination), then the target band would be expected to shift with only one probe. If the putatively targeted event is not truly targeted, then none of the bands in the target plant would be expected to shift with either of the probes. FIGS. 15B and 15C illustrate an exemplary analysis of this type.
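The interpretation rule described above can be written as a small decision function. The Python sketch below assumes that each probe result has already been reduced to a yes/no call on whether the target band shifted; the function name and return labels are hypothetical.

```python
# Sketch of the band-shift interpretation rule for the two Southern probes.
# Inputs are yes/no calls on whether the target band shifted with each probe.

def classify_event(hsp80_probe_shift: bool, nptii_probe_shift: bool) -> str:
    """Classify a putative targeted event from the two probe results."""
    if hsp80_probe_shift and nptii_probe_shift:
        return "double crossover"  # recombination at both ends of the locus
    if hsp80_probe_shift or nptii_probe_shift:
        return "single crossover"  # recombination at one side only
    return "not targeted"          # no target band shifted with either probe


if __name__ == "__main__":
    for calls in ((True, True), (True, False), (False, True), (False, False)):
        print(calls, "->", classify_event(*calls))
```

A shift of unexpected size, as opposed to a simple presence or absence of a shift, is not captured by this two-valued rule and would still need to be examined case by case.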

Target line T2701.6 (lanes 1 to 4, FIGS. 15B and 15C) gave rise to several restriction fragments that were easily separated by regular gel electrophoresis. FIGS. 15B and 15C (lanes 5-8) show that recombinant HR-03AD.2 had a restriction fragment size-shift, which is consistent with a double-sided recombination event. The other two events (HR-05AA.1 and HR-05AA.2, lanes 9-14) show a band shift that is consistent with recombination at only one side. One of the events (HR-05AA.2, lanes 12-14) shows band shifts with both probes, but the band with NheI digestion (lane 14) is smaller than expected, so there might be some rearrangement or deletion close to the HSP80 promoter. Because there is no restriction polymorphism in the region of homology between the target and donor sequences, it is not possible to distinguish whether the recombinants were derived from a reciprocal crossover or a non-reciprocal gene conversion.

Target line T2701.27 gave rise to restriction fragments that were not easily separated (larger than 10 kb NheI, SacI, and ScaI fragments with the nptII exon V-PAL 3′-UTR probe), and events derived from this line were analyzed only minimally, due to the difficulty in distinguishing small changes for band sizes larger than 15 kb in normal agarose gel electrophoresis.

Other targeted events were also characterized by Southern blot analysis. The results are summarized as follows: (1) About 70% of the targeted events resulted from single crossover recombination, and about 30% of the recombinants resulted from double crossover recombination. It is not known whether recombinants were the product of a reciprocal crossover or a non-reciprocal gene conversion process using incoming T-DNA as a template. (2) About half of the recombinants had additional copies of the donor sequence inserted elsewhere in the genome. (3) T-DNA is capable of carrying out homologous recombination (by either reciprocal crossover or non-reciprocal gene conversion), and it does not have to be integrated into the host genome first. (4) No unexpected rearrangement of the target locus or ectopic targeting was observed in any of the analyzed events.

Because a Southern blot analysis will not reveal rearrangements that result in relatively small changes in the size of a band, finer restriction mapping of the recombination breakpoints was done. A primer (PFDSP1, FIG. 14A; SEQ ID NO:55: 5′-ACC CTC CGC TAC TTC TCC GGG AAA AGA CGC-3′) was created based upon the flanking plant genomic DNA sequences obtained from I-PCR and used to perform long range PCR amplification in nine recombinant lines derived from T2701.6. PCR amplification was done using two primer pairs. The first pair of primers (PFDSP1 and TPalExonV, FIGS. 14A and 14C; SEQ ID NO:55 and SEQ ID NO:54) produced a 5.5 kb product from a non-targeted copy, a larger than 5.5 kb product from a targeted copy derived from single-sided recombination, and a 10 kb fragment from a targeted copy derived from double-sided recombination. More particularly, using this pair of primers (PFDSP1 and TPalExonV), a ca. 10 kb fragment was obtained from one hemizygous recombinant line (HR-03AD.2, see FIG. 14C for examples). A ca. 9 kb fragment was derived from both HR-05AA.2 and HR-01AB.1 and their progeny (FIG. 14C). A ca. 8 kb fragment was amplified from line HR-01AD.1, which is hemizygous for the target transgene locus (not shown). Thus, neither HR-01AB.1 nor HR-01AD.1 was the product of a double-crossover recombination. In six other heterozygous lines, only the shorter fragment (5.5 kb) was present, as was predicted from the preferential amplification of the non-targeted copy (result not shown). When the kanamycin-resistant progeny of a heterozygous recombinant HR-03AB.1 were subjected to PCR, a 10 kb fragment was produced (FIG. 14C).

Using another pair of primers (PFDSP1, SEQ ID NO:55, and HygRV1, SEQ ID NO:56: 5′-ACT ATC GGC GAG TAC TTC TAC ACA GCC ATC-3′; FIG. 14C, lanes 1 to 5), the PCR reaction produced a 5.4 kb product. This indicated that the recombinant was derived from double-sided recombination, because the HygRV1 primer could only bind to the Hyg gene present in the donor vector. The 5.4 kb product was present in both hemizygous and heterozygous recombinants derived from double-crossover recombination. A 5.4 kb fragment was obtained from 5 recombinant lines (HR-01AB.1, HR-02AC.1, HR-03AB.1, HR-03AD.2, and HR-05AA.2). No PCR product was derived from five other targeted events (HR-01AB.3, HR-01AC.1, HR-01AD.1, HR-01AD.4, and HR-03AB.2). In these latter recombinants, it is possible that a DNA rearrangement or a repeat structure was present, such that the PCR reaction was unable to amplify the entire region. Since both HR-01AB.1 and HR-05AA.2 produced a PCR product of only about 9 kb using PFDSP1 and TPalExonV (above) but produced a PCR product of about 5.4 kb with PFDSP1 and HygRV1, it is possible that there was an internal rearrangement (such as a deletion, for example) between the hygromycin phosphotransferase (HPT) gene and the mSmas promoter during targeting. In summary, Southern blot analyses and PCR results demonstrate that at least three events (HR-02AC.1, HR-03AB.1, and HR-03AD.2) were derived from double-crossover recombination with no additional rearrangement.

Example 12 Progeny Analysis of Targeted Events

In several recombinants, more than one copy of a donor sequence was integrated into the host cell's genome, as indicated by Southern blot analysis using the HSP80 promoter probe. To study the insertion status of the additional copy(ies) in these lines, plants were pollinated with untransformed SR1. The seeds were plated on PPT, kanamycin, or hygromycin medium. Table 2 shows the number of resistant and sensitive seedlings. In a hemizygous target line, half of the seedlings would be expected to be resistant to PPT, kanamycin, and hygromycin, if all donor copies are integrated into either a single locus or a closely linked locus. Here, all lines had the expected kanamycin resistance segregation ratios, as demonstrated by Southern blot analysis of each plant line. Southern blot analysis indicated that there were several additional copies of the donor sequence present in the HR-01AB.1 genome. The PPT and hygromycin segregation data supported this conclusion.

TABLE 2 Progeny segregation analysis of targeted events

                        Kan           PPT           Hyg
Crosses                 R      S      R      S      R      S

Hemizygous target*
T2701.6 target locus
HR-01AB.1 × SR1         56     64     121    10     80     9
HR-03AD.2 × SR1         115    111    50     58     38     35
T2701.27 target locus
HR-01CB.4 × SR1         43     46     37     42     21     21

Homozygous target*
T2701.6 target locus
HR-01AB.3 × SR1         46     57     154    0      NT
HR-03AB.1 × SR1         28     36     78     0      40     37
T2701.27 target locus
HR-01CC.4 × SR1         65     80     72     0      58     62

*The target status is extrapolated from Southern blot analysis using npt exonV/PAL 3′-UTR as probe. The plants in bold font are most likely derived from double crossover recombination as indicated by Southern blot analysis. NT: Not tested.
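The segregation counts in Table 2 can be compared with the 1:1 ratio expected for a single hemizygous locus backcrossed to SR1. The Python sketch below applies a chi-square goodness-of-fit test with one degree of freedom; the choice of this test and of the 5% significance threshold (critical value 3.841) is an assumption made for illustration and is not stated in the text. The names are hypothetical.

```python
# Sketch: chi-square goodness-of-fit test of resistant:sensitive counts
# against a 1:1 expectation (one degree of freedom).
# Assumption: 5% significance level, critical value 3.841.

CRITICAL_CHI2_DF1_P05 = 3.841


def chi_square_1_to_1(resistant: int, sensitive: int) -> float:
    """Chi-square statistic for observed counts against a 1:1 expectation."""
    total = resistant + sensitive
    if total == 0:
        raise ValueError("no seedlings scored")
    expected = total / 2.0
    return ((resistant - expected) ** 2 + (sensitive - expected) ** 2) / expected


def fits_1_to_1(resistant: int, sensitive: int) -> bool:
    """True if the counts do not deviate significantly from 1:1."""
    return chi_square_1_to_1(resistant, sensitive) < CRITICAL_CHI2_DF1_P05


# Counts (resistant, sensitive) from Table 2 for the hemizygous T2701.6 crosses.
HR_01AB_1 = {"Kan": (56, 64), "PPT": (121, 10), "Hyg": (80, 9)}
HR_03AD_2 = {"Kan": (115, 111), "PPT": (50, 58), "Hyg": (38, 35)}

if __name__ == "__main__":
    for line, data in (("HR-01AB.1", HR_01AB_1), ("HR-03AD.2", HR_03AD_2)):
        for marker, (r, s) in data.items():
            stat = chi_square_1_to_1(r, s)
            print(f"{line} {marker}: chi2 = {stat:.2f}, fits 1:1 = {fits_1_to_1(r, s)}")
```

Under these assumptions, kanamycin resistance fits 1:1 in both lines, whereas the PPT and hygromycin counts of HR-01AB.1 deviate strongly, in line with the additional donor copies reported for that line.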

Example 13 Construction of a Site-Specific FLP Recombinase Expression Vector

A 1.6 kb BamHI fragment containing FLP recombinase was excised from pUCFLP/intron (WO 99/55851) and inserted into pNOV2721 linearized with BamHI to create pNOV2760, thereby placing FLP under the control of the Arabidopsis Act2 promoter. pNOV2760 was digested with SacI and KpnI to isolate the 3.7 kb Act2 promoter/FLP-intron/Act2 terminator cassette. This fragment was then inserted into SacI/KpnI-digested pNOV1511 to create pNOV2762 (FIG. 13E). The PPO gene was isolated from Arabidopsis thaliana, and two mutant amino acids were introduced to obtain PPO(dm) (U.S. Pat. No. 6,308,458), which permitted the selection of transgenic cells with an herbicide (butafenacil, CGA 854,276).

Example 14 Generation of Transgenic Lines Expressing FLP Recombinase

The FLP recombinase binary vector pNOV2762 was transformed into Agrobacterium strain LBA4404, and the resulting Agrobacterium strain was used to transform tobacco SR1 as described above, with the exception that butafenacil was used as the selection agent. Several FLP-expressing lines were generated. One transgenic line (T2762.2) was selfed to produce progeny lines T2762.2S1 and T2762.2S2.

Example 15 Crossing Recombinant Plant Lines with FLP-Expressing Plant Lines

To regenerate a truncated nptII (with introns) selectable marker for gene stacking, recombinant line HR-03AD.2, which resulted from double-crossover recombination, was chosen for further studies. HR-03AD.2 was crossed with both T2762.2S1 and T2762.2S2 directly. The seeds from these crosses were plated on medium containing both butafenacil and hygromycin. Double-resistant seedlings were transplanted to soil and grown in a greenhouse. Each seedling was analyzed with a PCR assay to determine whether there was a FLP-mediated excision of the sequence flanked by the two FRT sites. The PCR assay (FIG. 16A) was performed with a forward primer (Tubq3FW; SEQ ID NO:57, 5′-GTG TCT CAT GCA CTT GGG AGG TGA TC-3′) located at the Ubq3 terminator and a reverse primer at the nptII exon 3 (NPTR3, SEQ ID NO:10). The wild type target locus produced a 3 kb PCR fragment; the same target locus with the Smas promoter and part of the nptII sequence (i.e., exon 1 and part of intron 1) excised by FLP-mediated site-specific recombination produced a 1.5 kb PCR fragment (see FIG. 16A). Seventy-two progeny seedlings were assayed by PCR, and 49 of those seedlings had a detectable 1.5 kb PCR fragment. Several lines with an excised nptII sequence (CFP-A7, CFP-B8, CFP-B11, CFP-C3, CFP-C6, CFP-D1, CFP-D5, CFP-E7, and CFP-E9) were crossed with SR1, and the progeny were selected on hygromycin. Hygromycin-resistant seedlings were then assayed by PCR to recover progeny with FLP-mediated excision.

Example 16 Retransformation of Recombinants with a FLP Expression Vector

Truncation of the nptII marker sequence can also be achieved by inserting the recombinase expression construct into the target lines and then allowing the recombinase locus to be lost through segregation. Kanamycin-resistant seedlings resulting from crossing HR-03AB.1 with SR1 and HR-03AD.2 with SR1 were re-transformed with Agrobacterium containing pNOV2762 to regenerate a truncated selectable marker gene for gene stacking. For retransformation with pNOV2762, leaf slices were infected with Agrobacterium (pNOV2762) and then selected on hygromycin and butafenacil. Regenerated shoots were rooted in medium with butafenacil. The rooted shoots were transplanted into soil and assayed by PCR, as described above, to determine whether the mSmas promoter and part of the nptII (with introns) gene were deleted (FIG. 16A). Among 44 independent transformants (HR-08AA's) of HR-03AB.1×SR1 kanamycin-resistant seedlings, 19 had a 1.5 kb PCR product. Among 44 transformants (HR-08BA's) of HR-03AD.2×SR1 kanamycin-resistant seedlings, 22 had a 1.5 kb PCR product.
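The excision assay lends itself to a simple scoring scheme. The Python sketch below interprets the bands obtained with Tubq3FW and NPTR3 (3 kb for the intact locus, 1.5 kb after FLP-mediated excision) and computes the fraction of plants showing the 1.5 kb product in Examples 15 and 16. The size tolerance and the "partial excision" label for plants showing both bands are assumptions made for illustration; all names are hypothetical.

```python
# Sketch: scoring the FLP excision PCR assay (Tubq3FW / NPTR3).
# Assumptions: a 0.2 kb sizing tolerance, and a "partial excision" call when
# both the 3 kb and the 1.5 kb products are present in the same plant.

UNEXCISED_KB = 3.0
EXCISED_KB = 1.5


def interpret_bands(band_sizes_kb, tolerance_kb: float = 0.2) -> str:
    """Interpret observed PCR product sizes (kb) for one plant."""
    has_unexcised = any(abs(b - UNEXCISED_KB) <= tolerance_kb for b in band_sizes_kb)
    has_excised = any(abs(b - EXCISED_KB) <= tolerance_kb for b in band_sizes_kb)
    if has_excised and has_unexcised:
        return "partial excision"
    if has_excised:
        return "excised"
    if has_unexcised:
        return "not excised"
    return "no diagnostic band"


def percent(count: int, total: int) -> float:
    """Percentage of `count` in `total`."""
    return 100.0 * count / total


# (plants with a detectable 1.5 kb product, plants assayed), as reported.
REPORTED = {
    "HR-03AD.2 x T2762.2 progeny": (49, 72),
    "HR-08AA transformants": (19, 44),
    "HR-08BA transformants": (22, 44),
}

if __name__ == "__main__":
    print(interpret_bands([3.0, 1.5]))
    for label, (positive, assayed) in REPORTED.items():
        print(f"{label}: {percent(positive, assayed):.1f}% with 1.5 kb product")
```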

Several lines with an excised nptII sequence (HR-08AA.17, HR-08AA.32, HR-08BA.13 and HR-08BA.20) were crossed with SR1, and the progeny were selected on hygromycin. Hygromycin-resistant seedlings were assayed by PCR to recover progeny with FLP-mediated excision. To facilitate the isolation of lines with complete excision of the nptII sequence, leaves of HR-08AA.32 were regenerated. One of the regenerants, HR-08AA32R2, had complete excision of FRT-flanked sequences and was pollinated with pollen from wild-type untransformed SR1. The progeny seedlings were tested for excision of the FRT-flanked mSmas and nptII sequences by PCR. PCR demonstrated that, in all of the progeny, the mSmas promoter and part of the nptII sequence had been excised (FIG. 16B). Progeny with the regenerated truncated marker gene are then capable of being used to stack additional donor trait cassettes.

The above Examples describe the production of a recombinant line through site-specific recombination-mediated excision of nucleotide sequences flanked by recombinase recognition sequences. This recombinant line, which includes a partially excised (i.e., truncated) selectable marker gene, can be re-used in further rounds of targeting, thereby permitting the use of a single selectable marker gene in combination with a site-specific recombination system to stack an unlimited number of gene cassettes (i.e., donor sequences) at a single locus in the genome of a host cell (see, e.g., FIGS. 12A and 12B).

Example 17a Construction of a PMI Gene (PMI-Intron) with Four Arabidopsis Introns

Four Arabidopsis introns were inserted into the phosphomannose isomerase (PMI) gene to form a PMI-intron sequence (FIG. 17), thereby increasing the length of the PMI gene from 1103 bp to 3452 bp. These four introns are from an AtBAF60 homolog, phenylalanine ammonia-lyase (PAL), tubulin-1-alpha, and tubulin-1-beta, as used for constructing the nptII introns in pNOV2720. An R recombinase recognition sequence (RS) was formed by annealing two complementary primers RSFW (SEQ ID NO:58: 5′-GAT CCG CGG TTG ATG AAA GAA TAA CGT ATT CTT TCA TCA A-3′) and RSRV (SEQ ID NO:59: 5′-GAT CTT GAT GAA AGA ATA CGT TAT TCT TTC ATC AAC CGC G-3′) and inserting them into BglII-digested pNOV2720 to form pNOV2783. PMI intron 1 (488 bps, from AtBAF60 intron) was amplified from pNOV2783 with PMIINTFA (SEQ ID NO:60: 5′-ATG CCG CAG GTA CCA AGC TGC GAA TCT TCG-3′) and PMIINTRA (SEQ ID NO:61: 5′-ATC GGG ATA CCT GAA AAA TTC AGA AAC AAA-3′). The other three introns were amplified directly from pNOV2720. PMI intron 2 (from the Arabidopsis PAL1 intron) was amplified from pNOV2720 with PMIINTFB (SEQ ID NO:62: 5′-CGG TCG CAG GTA TTA GTA CTA TTC TTT TGT-3′) and PMIINTRB (SEQ ID NO:63: 5′-CGG ATG TGC ACC TGT AAC GAA CAA AAA CAT-3′). PMI intron 3 (from the Arabidopsis tubulin-1-beta intron) was amplified from pNOV2720 with PMIINTFC (SEQ ID NO:64: 5′-ACC TGC AAG GTA TAT ATA TGA ATC GAT TTC-3′) and PMIINTRC (SEQ ID NO:65: 5′-GCG CCA CAC CTG TAA TAC AGA AAT GTT AAG-3′). PMI intron 4 (from the Arabidopsis tubulin-1-alpha intron) was amplified from pNOV2720 using PMIINTFD (SEQ ID NO:66: 5′-GTG AAA CAA GGT TAT TAA CGT TTT CCA CCT-3′) and PMIINTRD (SEQ ID NO:67: 5′-GTT CTG CAC CTG CAT CAA TGG AAA AAT ATT-3′).

PMI exons 1-5 were amplified from pNOV210, which contains the E. coli PMI coding sequence in pBluescript KS(+) (Stratagene, La Jolla, Calif.). PMI exon 1 (346 bps) was amplified from pNOV210 with PMIEXF1 (SEQ ID NO:68: 5′-GTG GAT CCG GCA GCA TGC AAA AAC TCA TTA ACT-3′) and PMIEXR1 (SEQ ID NO:69: 5′-TCG CAG CTT GGT ACC TGC GGC ATT TTC TTT GG-3′). PMI exon 2 (140 bps) was amplified from pNOV210 using PMIEXF2 (SEQ ID NO:70: 5′-AAT TTT TCA GGT ATC CCG ATG GAT GCC GCC-3′) and PMIEXR2 (SEQ ID NO:71: 5′-TAG TAC TAA TAC CTG CGA CCG GCT GGA GTA-3′). PMI exon 3 (290 bps) was amplified from pNOV210 with PMIEXF3 (SEQ ID NO:72: 5′-GTT CGT TAC AGG TGC ACA TCC GGC GAT TGC-3′) and PMIEXR3 (SEQ ID NO:73: 5′-TCA TAT ATA TAC CTT GCA GGT AAG CGT GCG-3′). PMI exon 4 (146 bps) was amplified from pNOV210 with PMIEXF4 (SEQ ID NO:74: 5′-CTG TAT TAC AGG TGT GGC GCT GGA AGT GAT-3′) and PMIEXR4 (SEQ ID NO:75: 5′-TGT TAA TAA CCT TGT TTC ACC GGC TGG GTC-3′). PMI exon 5 (283 bps) was amplified from pNOV210 with PMIEXF5 (SEQ ID NO:76: 5′-CGA TTG ATG CAG GTG CAG AAC TGG ACT TCC C-3′) and PMIEXR5 (SEQ ID NO:77: 5′-TGC TCG AGT CAT TAG CAA GAG ATG TTA ATT TT-3′).

PMI intron 1 (488 bps) and PMI exon 2 (140 bps) PCR fragments were co-amplified using PMIINTFA and PMIEXR2 to form a PMI intron 1::PMI exon 2 PCR fragment (630 bps). PMI exon 1 (346 bps) and the PMI intron 1::PMI exon 2 fragment (630 bps) were co-amplified with PMIEXF1 and PMIEXR2 to form a PMI exon 1::PMI intron 1::PMI exon 2 fragment (976 bps), which was then cloned into pCR2.1-TOPO (Invitrogen, Carlsbad, Calif.) to form pNOV2784. PMI intron 2 (449 bps) and PMI exon 3 (290 bps) PCR products were co-amplified with PMIINTFB and PMIEXR3 primers to form a PMI intron 2::PMI exon 3 fragment (740 bps). PMI exon 3 (290 bps) and PMI intron 3 (792 bps) were co-amplified with PMIEXF3 and PMIINTRC primers to form a PMI exon 3::PMI intron 3 fragment (1083 bps). PMI intron 4 (511 bps) and PMI exon 5 (283 bps) PCR products were co-amplified with PMIINTFD and PMIEXR5 to form a PMI intron 4::PMI exon 5 fragment (795 bps). PMI exon 2 (140 bps) and PMI intron 2::PMI exon 3 (740 bps) were co-amplified with PMIEXF2 and PMIEXR3 primers to form a PMI exon 2::PMI intron 2::PMI exon 3 fragment (881 bps). PMI intron 1 (488 bps) and the PMI exon 2::PMI intron 2::PMI exon 3 fragment (881 bps) were co-amplified with PMIINTFA and PMIEXR3 to form a PMI intron 1::PMI exon 2::PMI intron 2::PMI exon 3 PCR product (1370 bps), which was then cloned into pCR2.1-TOPO to form pNOV2785. A PMI exon 3::PMI intron 3 fragment (1083 bps) and PMI exon 4 (146 bps) were co-amplified using PMIEXF3 and PMIEXR4 to form a PMI exon 3::PMI intron 3::PMI exon 4 fragment (1230 bps), which was inserted into pCR2.1-TOPO to form pNOV2786. The PMI exon 4 fragment (146 bps) and the PMI intron 4::PMI exon 5 fragment (795 bps) were co-amplified with PMIEXF4 and PMIEXR5 primers to form a PMI exon 4::PMI intron 4::PMI exon 5 fragment (942 bps), which was inserted into pCR2.1-TOPO to form pNOV2787. pQD84A1 was partially digested with SacI and ScaI to isolate a 4910 bp vector fragment. pQD85B9 was cut with ScaI and SacI to isolate the 789 bp fragment, which was inserted into the 4910 bp pQD84A1 SacI/ScaI vector fragment to form pNOV2788. BstBI/BamHI-digested pQD86A13 was ligated with a BstBI/BamHI fragment (894 bps) of pQD87A19 to form pNOV2789. XhoI/BamHI-digested pBluescript KS(+) (Stratagene, La Jolla, Calif.) was ligated with the BssHII/BamHI fragment (1540 bps) of pQD88A1 and the BssHII/XhoI fragment (1928 bps) of pQD89A7 to form pNOV2790. pNOV2790 contained the full-length PMI sequence with four Arabidopsis introns inserted into pBluescript KS(+).
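
The fusion-PCR product sizes reported in this example can be cross-checked against the sizes of their input fragments. The following is only a bookkeeping sketch: the numbers are copied from the text, the names `FUSIONS` and `size_gaps` are illustrative, and the text does not explain the 1 to 2 bp differences that the check reveals.

```python
# Reported sizes (bp) copied from the example: (product, input sizes, reported size).
FUSIONS = [
    ("intron 1::exon 2", [488, 140], 630),
    ("exon 1::intron 1::exon 2", [346, 630], 976),
    ("intron 2::exon 3", [449, 290], 740),
    ("exon 3::intron 3", [290, 792], 1083),
    ("intron 4::exon 5", [511, 283], 795),
    ("exon 2::intron 2::exon 3", [140, 740], 881),
    ("intron 1::exon 2::intron 2::exon 3", [488, 881], 1370),
    ("exon 3::intron 3::exon 4", [1083, 146], 1230),
    ("exon 4::intron 4::exon 5", [146, 795], 942),
]


def size_gaps(fusions):
    """Return reported size minus the sum of the input sizes for each product."""
    return {name: reported - sum(parts) for name, parts, reported in fusions}


if __name__ == "__main__":
    for name, gap in size_gaps(FUSIONS).items():
        print(f"{name}: reported - sum(inputs) = {gap} bp")
```

Every reported product is within 2 bp of the sum of its inputs, which is what would be expected if each fusion simply joins the two fragments end to end.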

Example 17b Construction of a GUS Gene with an AtBAF60 Intron

To construct a GUS gene with an Arabidopsis intron from the AtBAF60 gene, an AtBAF60 intron (420 bps) was amplified from the Arabidopsis genome using two primers, GUSBAFFW1 (SEQ ID NO:78: 5′-TTG ACT GGC AGG TAC CAA GCT GCG AAT CTT CG-3′) and GUSBAFRV1 (SEQ ID NO:79: 5′-ATT GGC CAC CAC CTG AAA AAT TCA GAA ACA AA-3′). AtBAF60 (CHC1) is a gene that shares homology with the mammalian nucleosome-remodeling factor BAF60 (http://www.chromdb.org/). GUS exon 1 (645 bps) was amplified from pBI121 (Clontech) using two primers, GUSBAMHI (SEQ ID NO:80: 5′-GGA TCC AAC CAT GTT ACG TCC TGT AGA AA-3′) and BAFGUSRV1 (SEQ ID NO:81: 5′-CAG CTT GGT ACC TGC CAG TCA ACA GAC GCG AC-3′). GUS exon 2 (1200 bps) was amplified from pBI121 using two primers, BAFGUSFW1 (SEQ ID NO:82: 5′-TTG ACT GGC AGG TAC CAA GCT GCG AAT CTT CG-3′) and GUSSALI (SEQ ID NO:83: 5′-GTC GAC TCA TTG TTT GCC TCC CTG CTG CGG-3′). The GUS exon 1::AtBAF60 intron fragment (1049 bp) was formed by PCR using gel-purified GUS exon 1 (645 bp) and the AtBAF60 intron (420 bp) fragments as templates and two primers, GUSBAMHI (SEQ ID NO:84: 5′-GGA TCC AAC CAT GTT ACG TCC TGT AGA AA-3′) and GUSBAFRV1 (SEQ ID NO:85: 5′-ATT GGC CAC CAC CTG AAA AAT TCA GAA ACA AA-3′). The GUS exon 1::AtBAF60 intron fragment (1049 bp) was cloned into the pCR2.1-TOPO vector to form pNOV5001. The AtBAF60 intron::GUS exon 2 fragment (1620 bp) was formed by PCR using the AtBAF60 intron (420 bp) and GUS exon 2 (1200 bp) fragments as templates and GUSBAFFW1 (SEQ ID NO:86: 5′-TTG ACT GGC AGG TAC CAA GCT GCG AAT CTT CG-3′) and GUSSALI (SEQ ID NO:87: 5′-GTC GAC TCA TTG TTT GCC TCC CTG CTG CGG-3′) as primers. The AtBAF60 intron::GUS exon 2 fragment (1620 bp) was cloned into pCR2.1-TOPO to form pNOV5002. pNOV5003 was formed through a tripartite ligation of XhoI/BamHI-digested pBluescript KS(+) with two insert fragments, the pNOV5001 BamHI/HindIII fragment (961 bp) and the pNOV5002 XhoI/HindIII fragment (1312 bps).
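
Fusing two PCR products in a second round of PCR relies on the two fragments sharing a stretch of identical sequence at the junction, which the primers supply. The sketch below, which assumes the primer sequences are exactly as printed above, measures the shared junction sequence between the GUS exon 1 reverse primer (BAFGUSRV1) and the intron forward primer (GUSBAFFW1). The helper names `revcomp` and `junction_overlap` are illustrative and are not part of the described method.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(primer: str) -> str:
    """Strip the triplet spacing used in the text and upper-case the sequence."""
    return primer.replace(" ", "").upper()


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def junction_overlap(reverse_primer: str, forward_primer: str) -> int:
    """Length of the longest suffix of revcomp(reverse_primer) that equals a
    prefix of forward_primer, i.e. the sequence shared by the 3' end of the
    upstream product and the 5' end of the downstream product."""
    top = revcomp(reverse_primer)
    for k in range(min(len(top), len(forward_primer)), 0, -1):
        if top[-k:] == forward_primer[:k]:
            return k
    return 0


# Primer sequences as printed in this example.
BAFGUSRV1 = clean("CAG CTT GGT ACC TGC CAG TCA ACA GAC GCG AC")  # SEQ ID NO:81
GUSBAFFW1 = clean("TTG ACT GGC AGG TAC CAA GCT GCG AAT CTT CG")  # SEQ ID NO:78

if __name__ == "__main__":
    print(junction_overlap(BAFGUSRV1, GUSBAFFW1), "nt shared at the exon 1/intron junction")
```

With the sequences as printed, the two primers share 22 nucleotides at the junction, which is the region through which the GUS exon 1 and AtBAF60 intron fragments can prime on each other.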

Example 18 Construction of a Monocot Expression Vector Including a PMI-Intron Sequence

Binary backbone vector pNOV2114 was digested with HindIII and Asp718I. The ZmUbi promoter/Nos 3′-UTR fragment was excised from pBH16 as a HindIII/Asp718I fragment and ligated into this vector to form pNOVO44. The pBH16 construct contained the ZmUbi promoter-intron linked to the Nos (nopaline synthase) 3′-UTR by a linker with BamHI and SacI sites. pNOV2790 was digested with BamHI/BglII, and the 3011 bp fragment containing the 3′-remainder of the coding region was isolated. This fragment was then ligated into BamHI-digested pNOVO44 to form pNOVO42, which contained the 5′-truncated PMI-intron sequence. pNOV2790 was also digested with AflII, and an oligonucleotide converter was ligated into the site to change it into BamHI: TTAACGGATCCG, producing pQD90C2BamHI. This plasmid was digested with BamHI, and the 2832 bp fragment containing the 5′-remainder of the coding region was isolated. This fragment was ligated into the BamHI site of pNOVO44 to form pNOVO43, which contained the 3′-truncated PMI-intron sequence. pNOV2790 was digested with BamHI, and the 3011 bp fragment containing the full-length PMI gene was isolated. This PMI fragment was ligated into BamHI-digested p2114UbiNos to form pNOVO41, which contained the full-length PMI-intron sequence.

Example 19 Construction of a PPO-dm Selectable Marker Cassette for Monocots

The rice actin 1 promoter (McElroy et al. 1990 Plant Cell 2:163-171) was used to drive PPO-dm expression as a selectable marker. PPO-dm is a mutant form of the Arabidopsis PPO gene, which confers tolerance to the herbicide butafenacil. pNOV3010 is a biolistic fragment vector containing a rice actin 1 promoter-PMI expression cassette. pNOV3010 was partially digested with BamHI and filled-in with a Klenow fragment of E. coli DNA polymerase I to destroy the BamHI site in the intron of the 5′-region of the rice actin 1 gene, thus forming pNOV5004. The 2175 bp rice actin promoter sequence was removed from pNOV5004 by BamHI/PstI digestion and was inserted into BamHI/PstI-digested pBluescript KS(+) to form pNOV5012. pNOV5012 was digested with BamHI, filled-in with a Klenow fragment, partially cut with SacI, and then treated with calf intestine phosphatase to isolate the 5.1 kb vector. pNOV1511 (U.S. Pat. No. 6,308,458) was digested with NcoI, filled-in with a Klenow fragment, and then digested with SacI to isolate the 1898 bp PPO-dm::35S terminator fragment. This PPO-dm::35S terminator fragment was then inserted into the above pNOV5012 vector (5.1 kb) to form pNOV5013.

Example 20 Construction of a Monocot Target Vector with a PPO Herbicide Resistance Marker Gene

Two oligonucleotides, ICEUBGL2 (SEQ ID NO:88: 5′-TCG AAG ATC TCT ATA ACG GTC CTA AGG TAG-3′) and ICEUBAMH (SEQ ID NO:89: 5′-ACT TGG ATC CTC GCT ACC TTA GGA CCG TTA-3′), were annealed, filled-in with a Klenow fragment, and digested with BglII and BamHI to isolate a fragment containing an I-CeuI cleavage site. The isolated I-CeuI site fragment was inserted into BglII-digested pNOV2790 to form pNOV5006. pNOV5013 was digested with PspOMI, filled-in with a Klenow fragment, then partially cut with BamHI to isolate the 4069 bp rice Act1 promoter::PPO::35S terminator fragment. pNOV5014 was digested with SbfI, blunted with T4 DNA polymerase, and then cut with BglII to isolate the 8972 bp fragment. The 4069 bp rice Act1 promoter::PPO::35S terminator fragment of pNOV5013 was inserted into the SbfI/BglII vector fragment (8972 bps) of pNOV5014 to form target vector pNOV5025 (FIG. 17A). pNOV5014 was constructed by inserting the BglII/SpeI fragment (3034 bp) of pNOV5006 into BglII/SpeI-digested pNOVO41.
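
The first step of this example, annealing two partially complementary oligonucleotides and filling in with Klenow, can be modeled in a few lines. The sketch below assumes the oligonucleotide sequences are exactly as printed and that the two oligonucleotides anneal through their 3′ ends; `fill_in` and the other names are illustrative. The recognition sequences used are the standard ones for BglII (AGATCT) and BamHI (GGATCC).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def fill_in(top_oligo: str, bottom_oligo: str) -> str:
    """Top strand of the duplex obtained when two oligos anneal through their
    3' ends and a polymerase extends both strands to full length."""
    bottom_rc = revcomp(bottom_oligo)
    for k in range(min(len(top_oligo), len(bottom_rc)), 0, -1):
        if top_oligo[-k:] == bottom_rc[:k]:  # k-nt annealed region
            return top_oligo + bottom_rc[k:]
    raise ValueError("oligos share no complementary 3' region")


# Sequences as printed in this example, with the triplet spacing removed.
ICEUBGL2 = "TCGAAGATCTCTATAACGGTCCTAAGGTAG"  # SEQ ID NO:88
ICEUBAMH = "ACTTGGATCCTCGCTACCTTAGGACCGTTA"  # SEQ ID NO:89

DUPLEX_TOP = fill_in(ICEUBGL2, ICEUBAMH)

if __name__ == "__main__":
    print(DUPLEX_TOP, len(DUPLEX_TOP))
    print("BglII site at index", DUPLEX_TOP.find("AGATCT"))
    print("BamHI site at index", DUPLEX_TOP.find("GGATCC"))
```

Under these assumptions the filled-in duplex is 43 bp long, with a BglII site near one end and a BamHI site near the other, so the double digest releases the internal segment that carries the I-CeuI site.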

Example 21 Construction of a Monocot Target Vector with the Hygromycin Phosphotransferase (hpt) Gene as an Antibiotic Resistance Marker

Target binary vector pAdF55 was constructed by the following steps and was used to produce target plants through hygromycin selection. The method herein described may be used with any monocot plant and any of a variety of transformation methods, as described above. In this example, however, rice plants and Agrobacterium-mediated transformation were used (Hiei et al. 1994 Plant Journal 6:271-282).

Step 1: pAdF50, containing a new promoter-gene fusion (the rice Actin 1 promoter fused to the hygromycin gene), was built through a 3-way ligation of (1) pNEB193 cut with SalI and SphI, (2) the 2212 bp SalI-BamHI fragment of pNOV1100 containing the rice Actin 1 promoter, and (3) a 1029 bp BamHI-SphI PCR fragment containing the hygromycin gene amplified from pNOV11. The PCR primers for this reaction contained the BamHI and SphI sites. The PCR primer containing the SphI site also contained an additional restriction site, NotI, located between the SphI site and the 3′ end of the hygromycin gene; this restriction site was later used to excise the hygromycin gene.

Step 2: pAdF51 was derived from pAdF50 by adding the CMPS:GIG:Act2 3′-UTR cassette of pQD189A12 and the attP recognition sequence (a phage lambda integrase recognition sequence). pAdF51 was built through a 3-way ligation of (1) pAdF50 cut with PacI and XbaI, (2) the 3224 bp KpnI-XbaI fragment of pQD189A12 carrying the CMPS:GIG:Act2 3′ gene, and (3) a 260 bp PacI-KpnI PCR product carrying the attP recognition site, which was amplified from pQD188A7. The PCR primers used for this reaction contained the PacI and KpnI sites.

Step 3: pAdF52 is a subclone in pNEB193 of the EcoO109I-AscI fragment of pNOV5025, which contains a 35S terminator and the truncated PMI-introns::nos 3′ gene fragment. The construct was made by subcloning the 4864 bp EcoO109I (blunted with Klenow)-AscI fragment of pNOV5025 into vector pNEB193 cut with EcoO109I (blunted with Klenow) and AscI.

Step 4: pAdF53 was constructed by inserting the phage lambda integrase recognition site attB into pAdF52, downstream of the 35S terminator, using an NcoI site. The attB sequence was added using two annealed oligonucleotides with ends that were compatible with an NcoI site. A BspHI site was also included in the oligo sequence to assist in identifying clones that contained it.

Step 5: pAdF54 was constructed by adding the Zygosaccharomyces rouxii R recombinase recognition site (RS) to plasmid pAdF53, downstream of the truncated PMI::nos 3′ gene fragment, in a KpnI site. The RS sequence was added using two annealed oligos with ends that were compatible with a KpnI site. An AgeI site was also included in the oligo sequence to assist in identifying clones that contained it.

Step 6: The final construct, pAdF55, was built through a 3-way ligation of the 5684 bp PacI-AscI vector fragment of pQD199B2 ligated to the 6723 bp PacI-NotI fragment of pAdF51 and the 3566 bp NotI-AscI fragment of pAdF54.

Example 22 Construction of a Monocot Positive Control Vector

pNOV5013 was digested with PspOMI, filled-in with a Klenow fragment, and then partially cut with BamHI to isolate the 4069 bp rice Act1 promoter::PPO::35S terminator fragment. pNOV5015 was digested with SbfI, blunted with T4 DNA polymerase, and then partially cut with BglII to isolate the 11.5 kb vector fragment. The 4069 bp rice Act1 promoter::PPO::35S terminator fragment was inserted into the SbfI/BglII vector fragment (11.5 kb) of pNOV5015 to form pNOV5026, the positive control vector for targeting (FIG. 17A).

Example 23 Construction of Monocot Donor Vectors

pNOVO41 was digested with AscI, filled-in with a Klenow fragment, and then cut with BamHI to isolate the 7.7 kb AscI/BamHI fragment. pNOV5006 was digested with BstBI, filled-in with a Klenow fragment, and then cut with BamHI to isolate the 2652 bp BstBI/BamHI fragment. pNOV5016 was constructed by ligating the AscI/BamHI fragment (7.7 kb) of pNOVO41 with the BstBI/BamHI fragment (2652 bp) of pNOV5006. pNOV5013 was digested with PspOMI, filled-in with a Klenow fragment, and then partially cut with BamHI to isolate the 4069 bp rice Act1 promoter::PPO::35S terminator fragment. pNOV5016 was digested with SbfI, blunted with T4 DNA polymerase, and then partially cut with BglII to isolate the 10.3 kb vector fragment. The 4069 bp rice Act1 promoter::PPO::35S terminator fragment was inserted into an SbfI/BglII-digested pNOV5016 vector (10.3 kb) to form pNOV5027. pNOV5027 was partially digested with SrfI and BsrGI to delete part of the rice Act1 promoter, filled-in with a Klenow fragment, and then circularized to form pNOV5030. pNOV5030 was digested with PacI, filled-in with a Klenow fragment, and then partially cut with SalI to isolate a 13,768 bp PacI/SalI fragment as a vector. pNOV5019 is a plasmid derived from pBluescript KS(+) containing a rice α-tubulin promoter::GFPintron::AtAct2 terminator expression cassette. The Ecl136II/XhoI fragment (3286 bp) of pNOV5019 was inserted into the 13,768 bp PacI/SalI vector fragment to form a first monocot donor vector, pNOV5031 (FIG. 17B). pNOV5030 was cut with PacI, blunted with a Klenow fragment, partially digested with PspOMI, and then dephosphorylated with CIP to isolate a 13.8 kb PacI/PspOMI vector fragment. pNOV5044 was cut with Ecl136II and PspOMI to isolate a 3525 bp Ecl136II/PspOMI insert fragment. pNOV5044 contained a CMPS promoter::GUSbafintron::AtAct2-3′-UTR expression cassette in a pBluescript backbone. The above-described PacI/PspOMI fragment (13.8 kb) of pNOV5030 was ligated with the Ecl136II/PspOMI fragment (3535 bp) of pNOV5044 to form a second monocot donor vector, pNOV5045 (FIG. 17B).

Donor vectors with attB and attP sites were also constructed. These vectors allowed the use of phage lambda integrase to excise the intervening DNA sequences and thereby regenerate the selectable marker target site to permit gene stacking. To do this, complementary oligos ATTB1 (SEQ ID NO:90: 5′-GAT CCG CTC AAG TTA GTA TAA AAA AGC AGG CTT CAT GA-3′) and ATTB2 (SEQ ID NO:91: 5′-GAT CTC ATG AAG CCT GCT TTT TTA TAC TAA CTT GAG CG-3′) were annealed and inserted into BglII-digested pNOV2790 to form pQD187A8. The phage lambda attP sequence was amplified by PCR from the phage DNA with two primers, ATTPSPOMI (SEQ ID NO:92: 5′-GGG CCC TCT GTT ACA GGT CAC TAA TAC CAT CTA AG-3′) and ATTPSPEI (SEQ ID NO:93: 5′-ACT AGT GAA ATC AAA TAA TGA TTT TAT TTT G-3′), and the PCR product was cloned into the pCR2.1-TOPO vector to form pNOV5088. The attP sequence was removed from pNOV5088 by digestion with ApaI, treatment with a Klenow fragment, and then a second digestion with NotI. pNOV5089 was digested with XbaI, filled-in with a Klenow fragment, and then cut with NotI. The above XbaI/NotI fragment of pNOV5089 was then ligated with the ApaI/NotI fragment of pNOV5088 to form pNOV5094. pNOV5089 was derived from pNOV5044 by replacing the BstBI/SnaBI fragment of the GUSBAFintron with the BstBI/SnaBI fragment of the GUSintron from pNOV3603. The KpnI/SpeI fragment of pNOV5031 was replaced with the KpnI/SpeI fragment from pNOV5087 to form pNOV5095. pNOV5094 was cut with Ecl136II and PspOMI to isolate the 3517 bp fragment containing the CMPS promoter::GIG::Tact2::attP site. pNOV5095 was digested with PacI, blunted with a Klenow treatment, and then partially recut with PspOMI to isolate the 13805 bp fragment. The Ecl136II/PspOMI fragment of pNOV5094 was inserted into the above 13.8 kb fragment of pQD195A6 to form a third monocot donor vector, pNOV5096 (FIG. 17B). The 3608 bp Ecl136II/PspOMI fragment of pNOV5098 was inserted into the above PacI/PspOMI-digested pNOV5095 vector to form a fourth monocot donor vector, pQD200C6 (FIG. 17B), which included RS and FRT sites upstream of the ZmUbi promoter in addition to the attP sequence. A binary vector (pNOV5099) containing a positive control PMI-intron gene with the attB sequence in the first intron was constructed by inserting the 3551 bp BamHI fragment of pQD187A8 into BamHI-digested pNOVO41. Another positive control binary vector (pQD203A11) was created by inserting the NcoI(blunt)/PspOMI fragment (3.6 kb) of pNOV5098 into a (PacI)blunt/PspOMI fragment (11535 bps) of pNOV5099.

Example 24 Construction of I-CeuI Expression Vectors for Monocots

An I-CeuI sequence with maize-preferred codons was released from pSmasICeuIintron as a BamHI/KpnI fragment (1154 bps), and a maize ubiquitin promoter (ZmUbi) was released from pNOV2115 as a BamHI/HindIII fragment (2005 bps). These two fragments (I-CeuI and ZmUbi) were ligated into KpnI/HindIII-digested pNOV2114 to form pNOV5033. pNOV2114 is a binary backbone vector with a VS1 origin, one copy of the VirG gene, and a spectinomycin resistance gene for selection in bacteria. The BamHI/KpnI fragment of I-CeuI and the ZmUbi fragment were also ligated with KpnI/HindIII-digested pNOV2122 to form pNOV5034. pNOV2122 is a binary backbone vector with an RK2 origin of replication, one copy of the VirG gene, and a kanamycin resistance gene for selection in bacteria. In both pNOV5033 and pNOV5034, I-CeuI expression was under the control of a maize ubiquitin promoter.

Example 25 Generation of Target Maize Plants

Target plants can be generated through Agrobacterium- or biolistic-mediated transformation using target vectors pNOV5025 and pAdF55 with any of several monocot plants, such as maize, rice, wheat, or barley. Maize examples are provided here to demonstrate the feasibility of gene targeting through homologous recombination in monocot plants. The transformation of immature maize embryos was performed essentially as described in Negrotto et al. (2000 Plant Cell Reports 19:798-803), which describes the use of PMI as the selectable marker gene and mannose as the selection agent, and Li et al. (2003 Plant Physiol. 133:736-747), which describes the use of PPO as the selectable marker gene and butafenacil as the selection agent. For this example, all media constituents are as described in Negrotto et al. and Li et al., supra. However, various media constituents described in the literature may be substituted.

Target binary vector pNOV5025 contained the mutant protoporphyrinogen oxidase (protox) (PPO) gene (U.S. Pat. No. 6,308,458), which permitted the selection of transgenic cells on herbicide-supplemented medium (i.e., medium containing butafenacil). See Li et al. 2003 Plant Physiol. 133:736-747. A positive control vector, pNOV5026, was also included.

Agrobacterium strain LBA4404 (pSB1) containing pNOV5025 was grown on YEP (yeast extract (5 g/L), peptone (10 g/L), NaCl (5 g/L), agar (15 g/L), pH 6.8) solid medium for 2-4 days at 28° C. Approximately 0.8×10⁹ Agrobacteria (about 0.75 A₆₆₀) per ml were resuspended in LS-inf media supplemented with 100 μM As (Negrotto et al. 2000 Plant Cell Rep 19:798-803). Bacteria were pre-induced in this medium for 30-60 minutes. For this example, immature embryos from A188×Hi II were excised from 8-12 day old ears into liquid LS-inf+100 μM As. However, immature embryos derived from various other crosses or selfed A188 or HiII plants can be used as transformation targets. The embryos were rinsed once with fresh infection medium and heat-shocked at 45° C. for 5 minutes. The infection medium was replaced with Agrobacterium solution, and the embryos were vortexed for 30 seconds and allowed to settle with the bacteria for 5 minutes. The embryos were then transferred scutellum side up to LSAs medium and cultured in the dark for two to three days. Subsequently, between 20 and 25 embryos per petri plate were transferred to LSDc medium supplemented with ticarcillin (250 mg/l) and silver nitrate (1.6 mg/l) and cultured in the dark at 28° C. for 10 days.

Selection was performed essentially as described in Li et al., supra. Silver nitrate was used in both the initiation and selection media, and sucrose was used at 30 g/L. The protox inhibitory herbicide butafenacil was added to the media at 5 nM for initiation and primary selection, 500 nM for second selection, and 750 nM for the final selection. Regeneration 1 was carried out on media supplemented with 50 nM herbicide with no herbicide selection in subsequent regeneration media. Maize leaf samples were assayed by Taqman analysis for the copy number of PPO and PMI genes. Maize events (for example, AW286B1A to AW289B1C, AW289B1A to AW289B1C, AW289E2D and AW289F2C etc.) with a single copy of both genes were transplanted into soil and grown in the greenhouse.

Example 26 Targeted Integration of a Donor Sequence by Homologous Recombination in Maize

Once the primary transgenic lines containing the desired T-DNA target (i.e., a target containing single copies of both the PPO and PMI genes) are obtained, various materials derived from these plants and their progeny can be used as target tissue for retransformation to obtain targeted events. These materials can also be used as pollen donors or receptors to produce target tissues for retransformation. For this example, AW289B1A was either selfed or used to pollinate A188 or HiII to produce seeds. Pollen from AW289B1A also was used directly to pollinate untransformed A188 and to generate immature embryos for retransformation with donor vectors pNOV5031, pNOV5045, pNOV5096, and pQD200C6.

Immature embryos (7-10 days post-pollination) were isolated from immature ears and used for Agrobacterium-mediated transformation, as described by Negrotto et al., supra. In some experiments, an Agrobacterium culture containing an I-CeuI expression vector, pNOV5033, was mixed with an Agrobacterium strain containing the donor vector (in a 1:1 ratio). Targeted events were selected from Agrobacterium-infected immature maize embryos using mannose as a selection agent. Immature embryos producing embryogenic calli were transferred to LSD1M0.5S medium. The cultures were selected on this medium for 3 weeks, transferred to fresh LSD1M0.5S medium, and then incubated for another 3 weeks. Surviving calli were transferred to Reg1 medium supplemented with mannose. Following a culture period of 1 to 2 weeks in the light (16 hour light/8 hour dark regimen), green tissues were then transferred to Reg2 medium without growth regulators and then incubated for 1-2 weeks. Plantlets were transferred to Magenta GA-7 boxes (Magenta Corp, Chicago, Ill.) containing Reg3 medium and grown in the light (16 hour light/8 hour dark regimen). After 2-3 weeks, plants were transferred to the greenhouse for planting in soil. Maize lines HR-18FB.1A to HR-18FB.1N are putative mannose-resistant targeted lines. These lines were derived from the targeted integration of donor sequence pNOV5045 by homologous recombination in the presence of pNOV5033.

Example 27 Molecular Characterization of the Targeted Event

Putative mannose-resistant targeted events were confirmed by well-known molecular biological methods, including PCR and Southern blot analysis. For example, a Southern blot was prepared from the DNA of target line AW289B1A and putatively targeted line HR-18FB.1M. DNA samples were digested with various restriction enzymes, including KpnI, ScaI, SacI, SpeI and HpaI, and hybridized with two different target-specific probes from the 5′-region of the rice actin 1 promoter and the 3′-region of the PMIintrons (see FIGS. 20A and 20B for the blot and FIGS. 19A and 19B for the probe location and restriction map). The hybridization patterns were consistent with targeted double crossover recombination of pNOV5045 T-DNA with the target locus, which included T-DNA from pNOV5025.

The first target locus-specific probe (i.e., the 5′-region of the rice Act1, FIG. 19B) is from the rice actin 1 promoter 5′-upstream region that is not present in the donor vector pNOV5045 and is used to detect recombination at the LB end of the target locus. The second target locus-specific probe (i.e., PMIintrons 3′-region) hybridizes to the region containing the PMIintrons intron 4/exon 5 and is used to detect recombination at the RB end of the target locus (FIG. 19A).

Southern analysis confirmed that HR-18FB.1M is a truly targeted event derived from AW289B1A (FIGS. 19A to 19D and 20A to 20D). SacI digestion of the DNA samples would be expected to release an internal fragment from the target locus that included most of the introduced T-DNA sequences in both the target locus and the expected recombinant, but the size of the SacI band hybridizing to the PMI 3′-end probe would be expected to shift from 5.4 kb to 11.5 kb. As predicted, the size of the SacI fragment shifted from 5.4 kb in the target locus (AW289B1A) to 11.5 kb in the putative targeted line HR-18FB.1M when the PMI-intron 3′-fragment was used as a probe (FIG. 20B, lane 1a vs 6a).

KpnI digestion of the DNA samples resulted in a KpnI fragment that was also decreased in size, as predicted, from about 8 kb in the target locus to 3.5 kb in targeted line HR-18FB.1M when the PMI-intron 3′ fragment was used as a probe (FIG. 20B, lane 3a vs 8a).

With ScaI, SpeI, and HpaI digestions, the fragment sizes did not change, as predicted for targeted integration at the target locus (FIG. 20B, lane 2a vs 7a, lane 4a vs 9a, lane 5a vs 10a). Because ScaI and HpaI digests hybridized with the PMI-intron 3′-probe would detect changes in sequences outside the T-DNA, the results indicated that no DNA rearrangement could be detected on the right border of the T-DNA locus.

With the rice actin-1 5′-region probe, all five digestions indicated that recombination had occurred on the PPO side of the target locus. With the exception of SpeI, all band shifts in the targeted line HR-18FB.1M as compared with the target line AW289B1A were as expected (FIG. 20B). Because SpeI digestion is sensitive to overlapping cytosine methylation, it is possible that the SpeI site between the PPO and GUS genes was methylated. If this were the case, the size of the SpeI fragment would have been expected to increase to 13 kb rather than be reduced to 6 kb. Since the SpeI fragment detected by the rice actin-1 probe was in fact 13 kb, methylation was the likely cause of the band shift (FIG. 20B, lane 4b vs 9b). It is also possible that there was a rearrangement, such as a deletion, that led to the loss of the SpeI site.

Overall, the Southern blot data are consistent with the occurrence of targeted integration of the donor T-DNA into the target locus in line HR-18FB.1M by double crossover homologous recombination.
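
The reasoning in this example amounts to comparing each observed band with the size predicted for a double crossover event. The sketch below restates that comparison using only the fragment sizes given in the text; the 0.5 kb tolerance, the `Band` record, and the `matches` helper are assumptions introduced for illustration.

```python
from typing import NamedTuple


class Band(NamedTuple):
    enzyme: str
    probe: str
    predicted_kb: float  # size predicted for the targeted line
    observed_kb: float   # size observed in HR-18FB.1M


def matches(band: Band, tolerance_kb: float = 0.5) -> bool:
    """True when the observed band is within the (assumed) tolerance of the prediction."""
    return abs(band.observed_kb - band.predicted_kb) <= tolerance_kb


# Sizes taken from the text of this example.
SACI = Band("SacI", "PMI-intron 3'", predicted_kb=11.5, observed_kb=11.5)
KPNI = Band("KpnI", "PMI-intron 3'", predicted_kb=3.5, observed_kb=3.5)
SPEI_UNMETHYLATED = Band("SpeI", "rice actin-1 5'", predicted_kb=6.0, observed_kb=13.0)
SPEI_METHYLATED = Band("SpeI", "rice actin-1 5'", predicted_kb=13.0, observed_kb=13.0)

if __name__ == "__main__":
    for band in (SACI, KPNI, SPEI_UNMETHYLATED, SPEI_METHYLATED):
        print(band.enzyme, band.probe, "consistent" if matches(band) else "inconsistent")
```

The SpeI band fits the prediction only under the methylated-site model, which mirrors the interpretation offered in the text; the sketch cannot distinguish methylation from a deletion that removes the SpeI site.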

Example 28 Preparation of a Site-Specific R Recombinase Nucleotide Sequence with Maize-Preferred Codons and Construction of an R Recombinase Expression Vector

A site-specific R recombinase amino acid sequence was back-translated into a DNA nucleotide sequence as shown in SEQ ID NO:94 using maize-preferred codons (U.S. Pat. No. 6,121,014). Sequences flanking the synthetic R recombinase (ZmR) coding region are indicated in lowercase letters.

SEQ ID NO: 94: R recombinase with maize-preferred codons (ZmR)ctcgagcaaccATGCAGCTGACCAAGGACACCGAGATCAGCACCATCAACCGCCAGATGAGCGACTTCAGCGAGCTGAGCCAGATCCTGCCCCTGCACCAGATCAGCAAGATCAAGGACATCCTGGAGAACGAGAACCCCCTGCCCAAGGAGAAGCTGGCCAGCCACCTGACCATGATCATCCTGATGGCCAACCTGGCCAGCCAGAAGCGCAAGGACGTGCCCGTGAAGCGCAGCACCTTCCTGAAGTACCAGCGCAGCATCAGCAAGACCCTGCAGTACGACAGCAGCACCAAGACCGTGAGCTTCGAGTACCACCTGAAGGACCCCAGCAAGCTGATCAAGGGCCTGGAGGACGTGGTGAGCCCCTACCGCTTCGTGGTGGGCGTGCACGAGAAGCCCGACGACGTGATGAGCCACCTGAGCGCCGTGCACATGCGCAAGGAGGCCGGCCGCAAGCGCGACCTGGGCAACAAGATCAACGACGAGATCACCAAGATCGCCGAGACCCAGGAGACCATCTGGGGCTTCGTGGGCAAGACCATGGACCTGATCGAGGCCCGCACCACCCGCCCCACCACCAAGGCCGCCTACAACCTGCTGCTGCAGGCCACCTTCATGAACTGCTGCCGCGCCGACGACCTGAAGAACACCGACATCAAGACCTTCGAGGTGATCCCCGACAAGCACCTGGGCCGCATGCTGCGCGCCTTCGTGCCCGAGACCAAGACCGGCACCCGCTTCGTGTACTTCTTCCCCTGCAAGGGCCGCTGCGACCCCCTGCTGGCCCTGGACAGCTACCTGCAGTGGACCGACCCCATCCCCAAGACCCGCACCACCGACGAGGACGCCCGCTACGACTACCAGCTGCTGCGCAACAGCCTGCTGGGCAGCTACGACGGCTTCATCAGCAAGCAGAGCGACGAGAGCATCTTCAAGATCCCCAACGGCCCCAAGGCCCACCTGGGCCGCCACGTGACCGCCAGCTACCTGAGCAACAACGAGATGGACAAGGAGGCCACCCTGTACGGCAACTGGAGCGCCGCCCGCGAGGAGGGCGTGAGCCGCGTGGCCAAGGCCCGCTACATGCACACCATCGAGAAGAGCCCCCCCAGCTACCTGTTCGCCTTCCTGAGCGGCTTCTACAACATCACCGCCGAGCGCGCCTGCGAGCTGGTGGACCCCAACAGCAACCCCTGCGAGCAGGACAAGAACATCCCCATGATCAGCGACATCGAGACCCTGATGGCCCGCTACGGCAAGAACGCCGAGATCATCCCCATGGACGTGCTGGTGTTCCTGAGCAGCTACGCCCGCTTCAAGAACAACGAGGGCAAGGAGTACAAGCTGCAGGCCCGCAGCAGCCGCGGCGTGCCCGACTTCCCCGACAACGGCCGCACCGCCCTGTACAACGCCCTGACCGCCGCCCACGTGAAGCGCCGCAAGATCAGCATCGTGGTGGGCCGCAGCATCGACACCAGCTGAagctt
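
Back-translation with a single preferred codon per amino acid can be sketched as a table lookup. The codon table below is an assumption: it was inferred from the codons visible in SEQ ID NO:94 and is not copied from U.S. Pat. No. 6,121,014. The 14-residue peptide was read off the first 14 codons of the coding region, and the names `PREFERRED_CODON` and `back_translate` are illustrative.

```python
# ASSUMPTION: one preferred codon per amino acid, inferred from the codons
# that appear in SEQ ID NO:94 (not taken from U.S. Pat. No. 6,121,014).
PREFERRED_CODON = {
    "A": "GCC", "C": "TGC", "D": "GAC", "E": "GAG", "F": "TTC",
    "G": "GGC", "H": "CAC", "I": "ATC", "K": "AAG", "L": "CTG",
    "M": "ATG", "N": "AAC", "P": "CCC", "Q": "CAG", "R": "CGC",
    "S": "AGC", "T": "ACC", "V": "GTG", "W": "TGG", "Y": "TAC",
    "*": "TGA",  # stop codon used at the end of SEQ ID NO:94
}


def back_translate(protein: str) -> str:
    """Replace each amino acid (one-letter code) with its preferred codon."""
    return "".join(PREFERRED_CODON[residue] for residue in protein.upper())


# First 14 codons of the ZmR coding region in SEQ ID NO:94 and the peptide
# they encode under the standard genetic code.
SEQ94_CODING_START = "ATGCAGCTGACCAAGGACACCGAGATCAGCACCATCAACCGC"
PEPTIDE_START = "MQLTKDTEISTINR"

if __name__ == "__main__":
    print(back_translate(PEPTIDE_START) == SEQ94_CODING_START)
```

A one-codon-per-residue table is the simplest form of codon optimization; it ignores considerations such as avoiding unwanted restriction sites or repeats, which the text does not discuss.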

This synthetic R recombinase with maize-preferred codons was synthesized and cloned into pUC19 to form pUC19-ZmR by IDT (Coralville, Iowa 52241). A ZmUbi-R expression cassette was inserted into binary vector pNOV2114 for maize transformation. A ZmR HindIII/BamHI fragment (1493 bp) was then removed from pUC19-ZmR by HindIII digestion, filled-in with a Klenow fragment, and then digested with BamHI and inserted into pNOV3603, which then was cut with SacI, blunted with a Klenow fragment, and digested with BamHI to form pQD204B1. pQD204B1 included the maize ubiquitin promoter to drive expression of ZmR, which was followed by a nopaline synthase terminator. The HindIII/KpnI fragment (3784 bp) of pQD204B1 containing the ZmUbi promoter::ZmR::Tnos cassette was inserted into a HindIII/KpnI-digested pNOV2114 binary backbone vector to form pQD205A1. pQD204B1 was also digested with KpnI, blunted by treatment with a Klenow fragment, and then recut with HindIII to isolate the 3780 bp KpnI/HindIII fragment containing the ZmUbi promoter::ZmR::Tnos expression cassette. This KpnI/HindIII fragment was inserted into pNOV2819, which was cut with SalI, filled-in with a Klenow fragment, and re-digested with HindIII to form binary vector pQD206B1. pQD206B1 contained a ZmR expression cassette (ZmUbi promoter::ZmR::Tnos) and a selectable marker gene cassette (CMPS promoter::PMI::Tnos). ZmR was also placed under the control of several tissue-specific promoters, including OsG, RA-8, P19, and OsMADS13, to avoid any potentially undesirable effects of constitutive expression. These vectors were referred to as pBSC11475 (OsG), pBSC11478 (RA-8), pBSC11479 (P19), and pBSC11480 (OsMADS13), respectively.

Example 29 Construction of Binary Vectors for Expressing Phage Lambda Integrase, an Integrase Mutant, and an Integration Host Factor

Phage lambda integrase, its double amino acid mutant (IntH218), and host factors with maize-preferred codons are described in WO/03083045. Binary vector pNOV2114IntIHFs contained maize-optimized lambda integrase (Int) and IHF α and β coding sequences under the control of a CMPS promoter followed by a Tnos terminator. The (HindIII)blunt/AscI fragment (4122 bp) containing the Int and IHF expression cassettes was removed from pNOV2114IntIHFs by HindIII digestion, filled-in with a Klenow treatment, recut with HindIII, and ligated with a (BamHI)blunt/AscI fragment (9541 bps) of pWC057 to form pQD208B12. pWC057 is a binary vector containing a ZmUbi promoter::AtPPO(dm)::T35S expression cassette (see U.S. Pat. No. 6,282,837). pQD208B12 is a binary transformation vector containing the CMPS promoter::Int::Tnos, CMPS promoter::IHFα::Tnos, and CMPS promoter::IHFβ::Tnos expression cassettes, as well as the ZmUbi promoter::AtPPOdm::T35S selectable marker cassette. Similarly, binary vector pNOV2114IntH218IHFs contains a maize-optimized lambda integrase mutant (IntH218) and IHF α and β coding sequences under the control of a CMPS promoter followed by a Tnos terminator. The (HindIII)blunt/AscI fragment (4122 bp) containing the IntH218 and IHF expression cassettes was removed from pNOV2114IntH218IHFs by HindIII digestion, filled-in with a Klenow treatment, recut with HindIII, and ligated with a (BamHI)blunt/AscI fragment (9541 bps) of pWC057 to form pQD209B16. pQD209B16 is a binary transformation vector containing the CMPS promoter::IntH218::Tnos, CMPS promoter::IHFα::Tnos, and CMPS promoter::IHFβ::Tnos expression cassettes, as well as the ZmUbi promoter::AtPPOdm::T35S selectable marker cassette. Plasmid vector pAdF62A (WO03/083045), containing the synthetic XIS gene with maize-optimized codons, was cut with SpeI, filled-in with Klenow, and then re-cut with AscI to isolate the SpeI-AscI fragment containing the CMPS promoter, XIS gene, and nos terminator. This fragment was inserted into AscI/SwaI-digested pQD208B12 and pQD209B16 to form pQD350A7 (also known as pBSC11348) and pQD351A15 (also known as pBSC11349), respectively.

Example 30 Generation of Transgenic Plant Lines Expressing ZmR, IntIHFs, and IntH218IHFs

Binary vectors pQD206B1, pQD208B12, pQD209B16, pBSC11348, pBSC11349, pBSC11475, pBSC11478, pBSC11479, and pBSC11480 were each transformed, individually, into Agrobacterium strain LBA4404 (pSB1). The individual cultures of the Agrobacterium strain were then used for co-cultivation with immature maize embryos. The co-cultivated embryos were placed on a selection medium containing an herbicide (butafenacil) to generate transgenic plants. The transgenic plants were crossed directly to target plants or they were self-pollinated to produce seeds, which were used to generate additional plant material to cross with other plants.

Example 31 Removal of the Promoter and Part of the PMI-Intron Sequence to Regenerate a Truncated PMI-Intron Sequence

Transgenic maize lines expressing either synthetic R recombinase or phage lambda integrase were obtained by Agrobacterium-mediated transformation using binary vectors pQD208B12, pQD209B16, pBSC11348, pBSC11349, pBSC11475, pBSC11478, pBSC11479, and pBSC11480. R recombinase- or integrase-expressing lines can be crossed with desirable targeted recombinants to excise both the promoter sequence and the region of the PMI coding sequence flanked by the RS, attB/attP, or attL/attR sequences to truncate the PMI selectable marker gene. The progeny are screened by PCR for the truncation. Lines with the truncated sequence are backcrossed with a non-transgenic parent line to produce seeds. These seeds are then germinated, and the seedlings are screened by PCR to recover lines with the desired truncated sequence but without the recombinase locus. Lines with a regenerated target site but without the recombinase gene are used for a second round of gene targeting.

Alternatively, recombinant lines can be re-transformed with either an R recombinase or a lambda integrase expression vector. Transformed lines are screened by PCR for the desired deletion. Lines with the desired deletion are backcrossed with untransformed plants to obtain seeds. These seeds are then germinated, and the seedlings are screened by PCR to recover lines with the desired deletion but without the R recombinase or lambda integrase locus. Lines with a regenerated target site but without the R recombinase or integrase gene are used for a second round of gene targeting.

Recombinase can also be delivered as a VirE2/VirF fusion protein expressed by Agrobacterium (Vergunst et al. 2000 Science 290:979-82). Maize tissues, preferably immature embryos or embryogenic callus, are infected with Agrobacterium cells containing vectors expressing R/integrase::VirE2/VirF fusion proteins. These fusion proteins are transported into plant cells, where they mediate site-specific deletion of a sequence flanked by recombinase recognition sequences, such as Lox, FRT, RS, attB/attP, or attL/attR sequences, in an orientation that allows excision of the flanked region. Regenerated plants are screened by PCR for the deletion. With this method, no recombinase or integrase expression vector DNA is delivered into the plant cells. Lines with the desired deletion can therefore be used directly for an additional round of gene targeting.

Example 32 Generation of Target Rice Plants

For this example, the rice (Oryza sativa var. javonica) cultivar “Kaybonnet” was used to generate a target rice plant. However, other rice cultivars also can be used (Hiei et al. (1994) Plant Journal 6:271-282; Dong et al. (1996) Molecular Breeding 2:267-276; Hiei et al. (1997) Plant Molecular Biology 35:205-218). Also, various media constituents described below may be varied or substituted.

Embryogenic responses were initiated and/or cultures were established from mature embryos by culturing on MS-CIM medium (MS basal salts, 4.3 g/liter; B5 vitamins (200×), 5 ml/liter; sucrose, 30 g/liter; proline, 500 mg/liter; glutamine, 500 mg/liter; casein hydrolysate, 300 mg/liter; 2,4-D (1 mg/ml), 2 ml/liter; pH adjusted to 5.8 with 1 N KOH; Phytagel, 3 g/liter). Either mature embryos at the initial stages of culture response or established culture lines were inoculated and co-cultivated with the Agrobacterium strain LBA4404 containing the desired vector construction (i.e., pNOV5025 or pAdF55).
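The working concentrations implied by the two stock-based additions in the MS-CIM recipe can be checked with the usual dilution relation (stock concentration × volume added ÷ final volume). The short Python sketch below is illustrative only; the function name and the assumption that the medium is made up to exactly 1 liter are not taken from the protocol itself.

```python
def final_concentration(stock_conc, stock_volume_ml, final_volume_ml=1000.0):
    """Concentration after diluting a stock into the final volume.

    The result is expressed in the same unit as stock_conc.
    Assumption: the medium is made up to final_volume_ml (1 liter by default).
    """
    return stock_conc * stock_volume_ml / final_volume_ml


# 2,4-D: a 1 mg/ml stock is the same as 1000 mg/liter; 2 ml is added per liter.
two_four_d_mg_per_liter = final_concentration(1000.0, 2.0)

# B5 vitamins: a 200x stock, 5 ml added per liter.
b5_fold = final_concentration(200.0, 5.0)

print(f"2,4-D working concentration: {two_four_d_mg_per_liter:g} mg/liter")
print(f"B5 vitamins working strength: {b5_fold:g}x")
```

Under these assumptions the recipe corresponds to 2 mg/liter 2,4-D and 1× B5 vitamins.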

Agrobacterium was cultured from glycerol stocks on solid YP medium (100 mg/L spectinomycin) for 3 days at 28° C., then streaked again and cultured for 1-2 days. Agrobacterium was re-suspended in liquid MS-CIM medium. The Agrobacterium culture was diluted to an OD600 of 0.2-0.3, and acetosyringone was added to a final concentration of 200 µM. Agrobacterium was induced with acetosyringone for at least 30 min before mixing the solution with the rice cultures.

For inoculation, the cultures were immersed in the bacterial suspension for 30 min. The liquid suspension was removed with a vacuum aspirator, and the inoculated cultures were placed on Whatman filter paper on co-cultivation medium MS-CIM-As (MS-CIM with 200 µM acetosyringone) and incubated at 22° C. for two days. The cultures were then transferred to MS-CIM medium with ticarcillin (400 mg/liter) to inhibit the growth of Agrobacterium. For pNOV5025, a protox inhibitory herbicide (e.g., CGA 856,276 or butafenacil) (U.S. Pat. No. 6,282,837) was used for selection. After 14 days, cultures were transferred to selection medium containing compound CGA 856,276, MSI/856,276 (MS-CIM with 1000 nM butafenacil, 200 mg/liter timentin), and cultured for 28 days in the dark. Resistant colonies were then transferred to regeneration induction medium (MS-CIM with no 2,4-D, 0.5 mg/liter IAA, 1 mg/liter zeatin, 200 mg/liter timentin, and butafenacil) and grown in the dark for 14 days. Proliferating colonies were then transferred to another round of regeneration induction medium and moved to the light growth room. Regenerated shoots were transferred to GA7-1 medium (MS without hormones) for 2 weeks and then moved to the greenhouse when they were large enough and had adequate roots. Plants were transplanted to soil in the greenhouse and grown to maturity. For pAdF55, a similar protocol was used to generate transgenic plants, except that hygromycin, rather than butafenacil, was used as the selection agent.

Example 33 Targeted Integration of a Donor Construct into Target Rice Lines

Primary transgenic target rice lines, preferably single copy lines, containing T-DNA from target vector pNOV5025 or pAdF55 were self-pollinated to obtain seeds. Seeds from selfed progeny of these lines were also used for establishing embryogenic cultures and suspension cultures for targeting experiments. Immature embryos from young seeds or mature embryos from dry seeds are used to establish embryogenic cultures (Hiei et al. 1994 Plant Journal 6:271-282; Dong et al. 1996 Molecular Breeding 2:267-276; Hiei et al. 1997 Plant Molecular Biology 35:205-218). These cultures or suspension cell clusters are then used for Agrobacterium-mediated transformation.

Agrobacterium strain LBA4404 containing the targeting donor vector pQD200C6 or pAdF77 was used for generating targeted events from target lines derived from pNOV5025 or pAdF55, respectively. Other targeting vectors using flanking genomic sequences as a region of homology can also be designed and used. In this case, the length of homology could be increased or decreased, as needed, and the selectable marker gene sequences used to introduce the target sequence can be replaced. Targeted events were selected from Agrobacterium-infected rice embryogenic cultures using the selection and regeneration processes described above, with the exception that 2% mannose was used as the selection agent. Two target lines (RITI2001001226A1A and RITI2001001226A5A, referred to hereafter as lines A1A and A5A) were used for the targeting study with donor vector pQD200C6. Both lines have 2 copies of T-DNA inserted in the genome, most likely at unlinked positions. Two lines derived from pAdF55 (AdF55-15A and AdF55-35A) were also randomly selected for the gene targeting study with donor vector pAdF77. Callus or suspension cell cultures were initiated from mature seeds of target plants and were co-cultured with Agrobacterium cells containing the donor vector. Co-cultivations were also done with a mixture of two Agrobacterium strains, one containing the donor vector and the other containing pNOV5033, the expression vector for the mega-endonuclease I-CeuI. Co-cultured calli were selected on mannose-containing medium to recover targeted events. Mannose-resistant callus can be seen within a month after selection. Resistant calli were regenerated into plants. A PCR assay using two primers (PMIExFW1 and PMIExRV5) was used to confirm whether the mannose-resistant plants indeed contained a full-length recombinant PMI-intron sequence. Only plants derived from recombination between the truncated PMI-intron gene sequences of the target and the donor produce a PCR product of 3.5 kb. Most of the recovered events tested positive in this assay, suggesting that mannose selection is very effective in recovering targeted events in rice. Co-delivery of the I-CeuI expression vector pNOV5033 with the donor vector increased the number of targeted events, especially for line A1A. For the two target lines derived from pAdF55 (AdF55-15A and AdF55-35A), all targeted events were obtained when the donor vector was co-delivered with the I-CeuI endonuclease expression vector (Table 3).

TABLE 3
Targeted integration of a donor into target rice lines

Target     Target    Target tissue            Exp. ID.   Donor:I-CeuI     Tissue fresh   No. of mannose
vector     line                                          vector ratio *   wt. (g)        resistant events
pNOV5025   A5A       Suspension culture       664.154    1:0              0.96           2
                     cells from T2 seeds                 1:1              1.13           0
                                                         1:½              1.09           0
                                                         1:⅕              1.08           1
                                                         Pos. ctrl        1.13           64
pNOV5025   A5A       Suspension culture       664.165    1:0              1.10           0
                     cells from T2 seeds                 1:1              1.15           2
                                                         1:½              1.58           0
                                                         1:⅕              1.31           1
                                                         Pos. ctrl        1.13           55
pNOV5025   A5A, T2   Calli from T2 seeds      664.162    1:0              1.18           0
                                                         1:1              1.26           2
                                                         1:½              1.31           1
                                                         1:⅕              1.34           0
                                                         Pos. ctrl        1.15           15
pAdF55     15A, T1   Calli from T1 seeds      664.151    1:0              2.08           0
                                                         1:1              1.98           1
                                                         1:½              1.98           2
                                                         1:⅕              1.98           1
pAdF55     35A, T1   Calli from T1 seeds      664.151    1:0              1.96           0
                                                         1:1              3.1            5
                                                         1:½              1.9            0
                                                         1:⅕              2.07           0
                                                         Pos. ctrl        1.93           49

* Note: Donor vectors used were pQD200C6 and pAdF77 for target lines derived from pNOV5025 and pAdF55, respectively. For I-CeuI endonuclease expression, vector pNOV5033 was used. Pos. ctrl: positive control; pNOV2147 was used for estimating overall transformation (random integration) efficiency.
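The effect of co-delivering the I-CeuI vector can be summarized by tallying the mannose-resistant events in Table 3 with and without pNOV5033. The Python sketch below is only an illustration: the tuple layout and the function name are hypothetical, positive-control rows are left out, and the 35A entry at ratio 1:½ is read as 1.9 g and 0 events, which is an interpretation of the table as printed.

```python
# (target vector, target line, Exp. ID., parts I-CeuI vector per 1 part donor,
#  tissue fresh wt. in g, mannose-resistant events); positive controls omitted.
ROWS = [
    ("pNOV5025", "A5A", "664.154", 0.0, 0.96, 2),
    ("pNOV5025", "A5A", "664.154", 1.0, 1.13, 0),
    ("pNOV5025", "A5A", "664.154", 0.5, 1.09, 0),
    ("pNOV5025", "A5A", "664.154", 0.2, 1.08, 1),
    ("pNOV5025", "A5A", "664.165", 0.0, 1.10, 0),
    ("pNOV5025", "A5A", "664.165", 1.0, 1.15, 2),
    ("pNOV5025", "A5A", "664.165", 0.5, 1.58, 0),
    ("pNOV5025", "A5A", "664.165", 0.2, 1.31, 1),
    ("pNOV5025", "A5A, T2", "664.162", 0.0, 1.18, 0),
    ("pNOV5025", "A5A, T2", "664.162", 1.0, 1.26, 2),
    ("pNOV5025", "A5A, T2", "664.162", 0.5, 1.31, 1),
    ("pNOV5025", "A5A, T2", "664.162", 0.2, 1.34, 0),
    ("pAdF55", "15A, T1", "664.151", 0.0, 2.08, 0),
    ("pAdF55", "15A, T1", "664.151", 1.0, 1.98, 1),
    ("pAdF55", "15A, T1", "664.151", 0.5, 1.98, 2),
    ("pAdF55", "15A, T1", "664.151", 0.2, 1.98, 1),
    ("pAdF55", "35A, T1", "664.151", 0.0, 1.96, 0),
    ("pAdF55", "35A, T1", "664.151", 1.0, 3.1, 5),
    ("pAdF55", "35A, T1", "664.151", 0.5, 1.9, 0),  # read as 1.9 g, 0 events
    ("pAdF55", "35A, T1", "664.151", 0.2, 2.07, 0),
]


def tally(rows, target_vector=None):
    """Return (events without I-CeuI, events with I-CeuI co-delivery)."""
    without = 0
    with_iceui = 0
    for vector, _line, _exp_id, iceui_parts, _fresh_wt, events in rows:
        if target_vector is not None and vector != target_vector:
            continue
        if iceui_parts == 0:
            without += events
        else:
            with_iceui += events
    return without, with_iceui


print("all target lines:", tally(ROWS))
print("pAdF55 lines:    ", tally(ROWS, "pAdF55"))
```

With the values as transcribed here, 2 events were recovered without the I-CeuI vector and 16 with it, and all 9 events from the pAdF55-derived lines came from co-delivery treatments. Note that three I-CeuI treatments are pooled against one donor-only treatment per experiment, so the raw totals are not a per-treatment comparison.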

Example 34 Suppression or Down-Regulation of RecQ Homologs to Enhance Gene Targeting Efficiency

1. Identification of RecQ Homologs in a Plant Genome

Plant genomic and cDNA sequence databases can be searched with various bioinformatics programs to identify homologs of bacterial, yeast, and animal RecQ proteins. For example, the Arabidopsis genome contains several RecQ homologs (Hartung et al. 2000 Nucleic Acids Res. 21, 4275-4282). To identify RecQ homologs in the rice genome, proprietary Syngenta rice genome (Myriad contigs V8, Nipponbare cultivar) sequences were searched with the TBLASTN program using the E. coli RecQ protein sequence (GenBank accession number: NP_756603) and the mouse RecQ-like protein (GenBank accession number: BC014735) as queries. Two contigs (CLB1350.2, CLB5120.2) produced a high score (517, E value=e-145). Another three contigs (CL003142.76, CL027228.91, and CLC370) produced lower but significant scores. Gene prediction programs (Fgenesh, Genscan, and Genmark) were used to predict the open reading frame of each hit.

Primers were designed for amplification of the cDNA. OsRecQcfw2 (SEQ ID NO:95: CAC CAT GAA GCA CGG TGT AAT TGA TGA TAA AGA A) and OsRecQcRv1 (SEQ ID NO:96: TCA AGA GGG AAT CTT TAT GCA GTT GTC GCA) amplified a cDNA of 2 kb (OsRecQB) from rice (Oryza sativa, cultivar Kaybonnet) young flowers. OsRecQdFW2 (SEQ ID NO:97: CAC CAT GAT AAA GCC AAG GGT CAA CTG GTC GGA T) and RecQdRV1 (SEQ ID NO:98: CTA GGC TAT TCT GGC GGA CTG CCA CGC AGG) amplified a cDNA of 3.5 kb (OsRecQA) from rice immature flowers. The OsRecQB (2 kb) and OsRecQA (3.5 kb) cDNA PCR products were cloned into pENTR-TOPO vector (Invitrogen) to form pQD356A27 and pQD363C8, respectively. The insert of each clone was sequenced. The OsRecQA cDNA (SEQ ID NO:99) contains an ORF of 3525 bp, which encodes a protein having 1174 amino acid residues (SEQ ID NO:100). The OsRecQB cDNA (SEQ ID NO:101) contains an ORF of 1419 bp, which encodes a protein having 472 amino acid residues (SEQ ID NO:102). When the rice genome open reading frame (ORF) databases (cultivar Nipponbare) were searched, a third homolog, OsRecQC (SEQ ID NO:103), having 4692 bp was also identified. This third homolog encodes a protein having 1563 amino acid residues (SEQ ID NO:104).
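The ORF and protein lengths stated above, and the fit of the OsRecQB primers to the ends of SEQ ID NO:101, can be checked with a few lines of Python. This is a sketch under stated assumptions: the helper names are hypothetical, each ORF length is taken to include the stop codon, and the leading CACC on the forward primer is assumed to be a 4-nt cloning overhang rather than part of the coding sequence.

```python
def protein_length(orf_bp):
    """Residues encoded by an ORF whose stated length includes the stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3 - 1


_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# ORF lengths (bp) and protein lengths (residues) as stated in the text.
STATED = {"OsRecQA": (3525, 1174), "OsRecQB": (1419, 472), "OsRecQC": (4692, 1563)}
for name, (orf_bp, residues) in STATED.items():
    print(name, "consistent:", protein_length(orf_bp) == residues)

# OsRecQB primers (SEQ ID NO:95 and NO:96) with spaces removed.
forward = "CACCATGAAGCACGGTGTAATTGATGATAAAGAA"
reverse = "TCAAGAGGGAATCTTTATGCAGTTGTCGCA"

# First and last 30 nt of the OsRecQB cDNA listing (SEQ ID NO:101).
cdna_start = "ATGAAGCACGGTGTAATTGATGATAAAGAA"
cdna_end = "TGCGACAACTGCATAAAGATTCCCTCTTGA"

# Assumption: the first 4 nt (CACC) are a cloning overhang, not template sequence.
print("forward primer matches 5' end:", forward[4:] == cdna_start)
print("reverse primer matches 3' end:", reverse_complement(reverse) == cdna_end)
```

All five checks print True for the values given in this example.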

SEQ ID NO: 99: OsRecQA cDNA from Oryza sativa (cultivar Kaybonnet)DEFINITION OsRecQA cDNA SOURCE Young flower. ORGANISMOryza sativa Cultivar Kaybonnet REFERENCE 1 (bases 1 to 3525) AUTHORSQiudeng Que CDS 1 . . . 3525 BASE COUNT 1090 a 736 c 805 g 894 t ORIGIN   1 ATGATAAAGC CAAGGGTCAA CTGGTCGGAT CATGCAAATG CTGTTCAAAG CTCCTGTATC  61 AAAGATGAAT TCCTGAGTTC AAGTTTTTTG TTCTCTTTAC CAACACAAAG GCCTAATCAG 121 GAAGCAGATT GTACGGGAAT GCTTCCTTTA AGGTCTGCTG CTTGCAGAAT TCAAGGCCTA 181 GAGCGTCTTC AAGCTCCATC CATTGAGAAG GCCTGGCGTT CTCTACGCAA CACTCAGGTT 241 GCACGGAAGA ATTATTTAAG ACCTGGTTTA TCTGGAAAAG TGAAAGATTG TGATAGCGAC 301 CATGCTCATA CTTATGGGAC AAGTTCTTCA TATAATGTTA ACAAAGTGGA CAGTGTGTCC 361 AGAAATAGGA ATCCCACCCA GGAAAGTATG CATCAGACGA CTGAAAGTGG TACTATGGAG 421 AAGAACAGTA GCCATCTGCC TGCAGGCACC AAGTCCTGTA CAAGGACTTA CCTGAACAAT 481 CATGTGGTGC AGGCAGATAC CATTACAACA ACAAATCAAA GTCTTGCAAG AACTGGTCCT 541 GAATTATTCA AGACTGCTCC TTTTATTGAC AACATGTGTG ATGATGCTAA ATTAGATGCC 601 ATGGATGAGG ATGAGCTTCT AGCGAGTATT GATGTGGACC GAATAGTCAT GGAACATTAT 661 CAAGCAACAA ATACACCCAG AGGGTCATCC AAATCTCCAT TAGAGAAGTG CAACTTCAAT 721 GGATTTGATG AGAATAATTT ACCACAAGAA CTCTCTATAA TGTGTGACCA CGGTAGCAAG 781 CTAGCTTTTT GCCCAGAGGC GAAGTCTCAT TTGCTTGAAA TGAAGGATAA CTTGCTTGCA 841 ATATCCCATG AGCTTATTGA CGGTCAACTC AGCCCTCAAC AATCTGATGA TCTTCATCAA 901 AAGAGAGCAC TCCTAAAGAA GCAGATTGAG CTGCTTGGGG AGTATACGGC GAGGTTAACC 961 CAAGATGAAG AGCGACAGCA GTCTCATTCT ATGGCCTCCA CAACAGCTCA TCAGGGCCAT1021 CACCCCACTA GCATCCTAAG TAGCTCTTTT GTAAAGGATA CCAATATATT CCGATCACCG1081 ATTTACACCA GGAATGAACC TGGGGAGAGT GGTTTATGCT TTTCTTCTGC TCCATATTCC1141 TATATGGATG GTTTAAGCAT GCCATTACCG TCTGTTCAGA GAGATTACAC TCCAAGGGCT1201 ATTGATATCA GTTACACTGA AGGTTCTGGT GATAAACAGT GGAGTAGTAC ACACTTTGCA1261 TGGACTAAGG AACTCGAGGC CAACAACAAA GGAGTATTTG GAAACCGTTC TTTTCGCCCA1321 AATCAACGAG AAATAACCAA CGCCACAATG AGTGGGAATG ATGTTTTTGT TTTGATGCCA1381 ACTGGTGGTG GAAAAAGTTT GACATATCAG CTTCCAGCAC TCATTTGTAA TGGCGTTACA1441 TTGGTAGTTT CTCCTCTCGT ATCGCTCATC CAAGACCAGA TCATGCATTT 
ATTGCAGGCA1501 AATATTTCTG CAGCTTACCT TAGCGCCAGC ATGGAGTGGT CAGAACAGCA GGAGATATTA1561 AGAGAATTAA TGTCTCCTAC ATGCACGTAC AAGTTACTGT ATGTTACGCC TGAAAAGATA1621 GCCAAGAGTG ATGCTCTGTT GAGACAATTG GAAAATTTAT ATTCGCGAGG CCATCTCTCT1681 AGAATTGTCA TTGATGAAGC CCACTGTGTT AGCCAGTGGG GTCATGATTT CCGACCTGAT1741 TACCAGCATC TAGGCATTTT AAAACAGAAG TTCCCGCAGA CGCCGGTCCT GGCCTTGACA1801 GCAACAGCAA CTGCAAGTGT CAAGGAAGAT GTCGTGCAAG TTCTAGGCCT TGCAAACTGC1861 ATTATTTTCA GACAAGGTTT TAATCGTCCA AATCTGAGGT ATTTTGTATG GCCCAAGACA1921 AAGAAGTGCC TCGAGGATAT CCATAACTTT ATACATGCAA ATCATAATAA AGAATGCGGC1981 ATCATATATT GCCTTTCGAG GATGGATTGT GAGAAAGTGG CTGCTAAATT AAGGGAATAT2041 GGGCACCAGG CATCACATTA TCATGGTAGC ATGGATCCTG AGGATAGAGC AAATATCCAG2101 AAACAGTGGA GCAAGGATAG GATCAACATA ATATGTGCTA CAGTTGCATT TGGGATGGGT2161 ATTAATAAAC CTGATGTCCG TTTTGTTATC CATCATTCCC TGCCCAAATC AATTGAAGGA2221 TATCATCAGG AGTGTGGACG TGCTGGTCGT GACAGTCAGC TTTCATCTTG TGTCCTGTTC2281 TACAATTATT CTGATTATAT TCGTCTCAAA CACATGGTTA CCCAAGGATT TGCGGAGCAA2341 GGAACATCAG CACCACGAGG AGGTTCTTCG CAGGAACAAG CGCTTGAAAC GCATAAGGAA2401 AATCTCCTGC GAATGGTTAG TTACTGCGAA AATGATGTGG ACTGCAGACG TCTACTACAG2461 CTGATCCACT TTGGAGAGAT GTTTAATCCT TCATGTTGTG CAAAAACATG TGATAATTGC2521 TTGAAAGAGT TGAGATGGGT CAAAAAAGAT GTGACCAACA TTGCTAGACA ATTGGTTGAT2581 CTGGTAATGA TGACAAAGCA AACATATTCA ACTACTCATA TTCTCGAAGT ATACAGAGGT2641 TCAGTAAACC AAAATGTCAA GAAGCACCGC CATGATACTT TGAGTCTTCA TGGAGCTGGA2701 AAGCATCTAG CTAAAGGTGA AGCAGCGAGA ATATTGCGCC ATCTAGTAAT TGAGGAAATA2761 CTCATTGAGG ATGTCAAAAA GAGCGAAAAC TATGGATCTG TATCATCTGT CTTAAAGACT2821 AATCATAAGA AAAGTGGTGA TCTTCTCTCT GGCAAGCACA ACGTTGTCCT CAAGTTCCCC2881 ACTCCTGAGA AGGCTCCTAA GATGGGTGTA CTCGATGAAT CGTCAGTTCC ACGAATTAAT2941 AAGACTAATC AACAGAGTCA AGTGGACGGG AGCCTTGCAG CCGAGCTTTA TGAAGCTTTG3001 CAATGCCTTA GGACTCAGAT AATGGATGAA AATCCACAAT TATTGGCATA CCACATATTT3061 AAAAACGAGA CATTGAAGGA AATCAGCAAC CGAATGCCAA GAACGAAAGA GGAACTTGTG3121 GAGATAAATG GCATCGGCAA GAACAAGCTG AACAAGTACG GGGACCGCGT GCTTGCAACC3181 ATAGAGGATT TCCTCGCCAG 
ATATCCAAAT GCGACCAGGA AAACCAGCAG CGGCGGCAGC3241 AACGAGCACA GCGAGGCGGT CAAGAAGCGA AGAGGCTTCT CCGTCACCAA CACCTCTACC3301 AACTGTGACG ACTTTGAGGA ACGCACGGTC CAGTCCAAGA AACGCGCTGC AAAGACACGT3361 ACAAGGCAGG AAATATCTGA TGCTGCCAGC ATCGTCCAGG ACGTCCGCTA CATAGATCTT3421 GAGCTAGATG GTTGTGAACA AGTCAATGAA GTGCCATACA GTGTACAAAA GCCTGTGGCT3481 TCTGGTAGGG TTTTACCTGC GTGGCAGTCC GCCAGAATAG CCTAG //SEQ ID NO: 100: Predicted OsRecQA protein sequence DEFINITIONPredicted OsRecQA protein sequence, 1174 amino acid residues SOURCEYoung flower. ORGANISM Oryza sativa Cultivar KaybonnetMIKPRVNWSDHANAVQSSCIKDEFLSSSFLFSLPTQRPNQEADCTGMLPLRSAACRIQGLERLQAPSIEKAWRSLRNTQVARKNYLRPGLSGKVKDCDSDHAHTYGTSSSYNVNKVDSVSRNRNPTQESMHQTTESGTMEKNSSHLPAGTKSCTRTYLNNHVVQADTITTTNQSLARTGPELFKTAPFIDNMCDDAKLDAMDEDELLASIDVDRIVMEHYQATNTPRGSSKSPLEKCNFNGFDENNLPQELSIMCDHGSKLAFCPEAKSHLLEMKDNLLAISHELIDGQLSPQQSDDLHQKRALLKKQIELLGEYTARLTQDEERQQSHSMASTTAHQGHHPTSILSSSFVKDTNIFRSPIYTRNEPGESGLCFSSAPYSYMDGLSMPLPSVQRDYTPRAIDISYTEGSGDKQWSSTHFAWTKELEANNKGVFGNRSFRPNQREITNATMSGNDVFVLMPTGGGKSLTYQLPALICNGVTLVVSPLVSLIQDQIMHLLQANISAAYLSASMEWSEQQEILRELMSPTCTYKLLYVTPEKIAKSDALLRQLENLYSRGHLSRIVIDEAHCVSQWGHDFRPDYQHLGILKQKFPQTPVLALTATATASVKEDVVQVLGLANCIIFRQGFNRPNLRYFVWPKTKKCLEDIHNFIHANHNKECGIIYCLSRMDCEKVAAKLREYGHQASHYHGSMDPEDRANIQKQWSIDRINIICATVAFGMGINKPDVRFVIHHSLPKSIEGYHQECGRAGRDSQLSSCVLEYNYSDYIRLIHMVTQGFAEQGTSAPRGGSSQEQALETHKENLLRMVSYCENDVDCRRLLQLIHFGEMFNPSCCAKTCDNCLKELRWVKKDVTNIARQLVDLVMMTKQTYSTTHILEVYRGSVNQNVKKHRHDTLSLHGAGKHLAKGEAARILRHLVIEEILIEDVKKSENYGSVSSVLKTNHKKSGDLLSGKHNVVLKFPTPEKAPKMGVLDESSVPRINKTNQQSQVDGSLAAELYEALQCLRTQIMDENPQLLAYHIFKNETLKEISNRMPRTKEELVEINGIGKNKLNKYGDRVLATIEDFLARYPNATRKTSSGGSNEHSEAVKKRRGFSVTNTSTNCDDFEERTVQSKKRAAKTRTRQEISDAASIVQDVRYIDLELDGCEQVNEVPYSVQKPVASGRVLPAWQSARIA // SEQ ID NO: 101:OsRecQB cDNA from Oryza sativa (cultivar Kaybonnet) DEFINITIONOsRecQB cDNA 1419 bp ORGANISM Oryza sativa Cultivar Kaybonnet SOURCEYoung flower REFERENCE 1 (bases 1 to 1419) AUTHORS Qiudeng QueBASE COUNT 427 a 306 c 338 g 348 t ORIGIN    
1ATGAAGCACG GTGTAATTGA TGATAAAGAA GTTGAGGTGA GAACTCCTTT GTTCAGACAG   61GCAGAATCCT CTGCTCGACA GACTCGCATC AATCTGGACT CCTTCGGGTT CTCCTCAGAT  121GATGACTTTG AAACGTTGGA GTCCCATTGT GATCGTTCAG TCAGTACCCA GAAGAAGGTG  181AACAGAGGAA ACAATAGATG TGAGTCATCC ACTTCAACTT CAAACAGAGA AACTCTAAGT  241TATCAGCAGC TCAACATGGA CACCTTTGTG CTTATGCCAA CAGGTGGTGG GAAGAGCTTG  301TGTTATCAGC TACCTGCAAC ACTGCACCCA GGTGTTACGG TTGTTGTATG CCCTCTACTG  361TCACTTATTG AGGATCAAAT TGTGGCATTA AACTTCAAGT TTGCTATACC AGCAGCATTT  421TTGAACTCTC AGCAGACACC TTCACAGTCA TCTGCAGTAA TCCAAGAGCT TAGAAGTGGT  481AAACCGTCAT TCAAACTCCT CTACGTCACT CCTGAAAGAA TGGCTGGAAA CAGCTCATTT  541ATTGGGATCC TCATAGGTTT ACACCAGAGG GGTTTACTGG CGAGATTTGT GATTGATGAA  601GCCCATTGTG TAAGTCAATG GGGACATGAC TTCCGCCCAG ATTACCGAGG CCTGGGATGC  661CTCAAACAGA ACTTCCCTCG AGTACCAATT ATGGCTTTAA CAGCTACAGC GACTGCATCT  721GTCTGCAAGG ACATACTAAG TACCTTGAGG ATCCCTAATG CAACGGTACT CAAGAGGAGC  781TTTGACAGAA CAAACCTGAA TTATGAGGTG ATTGGCAAGA CAAAAACTCC ACAGAAGCAG  841CTGGGTGATA TCCTAAAAGA GCGTTTCATG AACATGTCTG GTATCGTGTA CTGTCTGTCC  901AAAAATGAAT GTGCTGACAC TGCCAAGTTC TTGAGGGAGA AGTACAAGAT AAAATGCGCA  961CATTACCACG CTGGCTTGGC TGCTCGTCAA CGATCCAATG TACAAGGAAA ATGGCACAGC 1021GGAGAGGTCA AAGTCATTTG TGCGACCATA GCATTTGGCA TGGGAATAGA CAAACCTGAT 1081GTGCGCTTTG TTATCCACAA CACCATGTCA AAATCGATAG AAAGCTACTA TCAGGAGTCA 1141GGGAGAGCAG GAAGAGACAA TCTTCCGGCA CATTGCATTG TGTTATATCA GAAAAAGGAC 1201CTCGGTCGAA TTGTATGCAT GCTGAGGAAT TCAGGGAACT TCAAGAGTGA GAGCTTCAAG 1261GTTGCAATGG AGCAAGCAAA GAAAATGCAA ACATATTGCG AGCTGAAGAC AGAATGCCGG 1321AGGCAAACTC TTCTTGGCCA CTTCGGTGAG CAGTATGACA GGCAAAGGTG CAAACATGGT 1381TGTAGCCCTT GCGACAACTG CATAAAGATT CCCTCTTGA // SEQ ID No: 102:Predicted OsREcQB protein sequence DEFINITIONOsRecQB protein 472 amino acids ORGANISM Oryza sativa Cultivar 
KaybonnetMKHGVIDDKEVEVRTPLFRQAESSARQTRINLDSFGFSSDDDFETLESHCDRSVSTQKKVNRGNNRCESSTSTSNRETLSYQQLNMDTFVLMPTGGGKSLCYQLPATLHPGVTVVVCPLLSLIEDQIVALNFKFAIPAAFLNSQQTPSQSSAVIQELRSGIUSFKLLYVTPERMAGNSSFIGILIGLHQRGLLARFVIDEAHCVSQWGHDFRPDYRGLGCLKQNFPRVPIMALTATATASVCKDILSTLRIPNATVLKRSFDRTNLNYEVIGKTKTPQKQLGDILKERFMNMSGIVYCLSKNECADTAKFLREKYKIKCAHYHAGLAARQRSNVQGKWHSGEVKVICATIAFGMGIDKPDVRFVIHNTMSKSIESYYQESGRAGRDNLPAHCIVLYQKKDLGRIVCMLRNSGNFKSESFKVAMEQAKKMQTYCELKTECRRQTLLGHFGEQYDRQRCKHGCSPCDNCIKIPS // SEQ ID NO: 103:OsRecQC cDNA from Oryza sativa (cultivar Nipponbare) DEFINITIONOsRecQC Open Reading Frame 4692 bp DNA SOURCE Oryza sativa cv NipponbareREFERENCE 1 (bases 1 to 4692) CDS 1 . . . 4692 BASE COUNT817 a 1669 c 1511 g 695 t ORIGIN    1ATGGCTTCCC GTCCCGCGCA CGACCTGCTT CAACGCGTCT TTGGTTACGA CGATTTCCGT   61GGTCCGCAGC AGGACATCGT GGAGCATGTG GCTGCCGGTC ACGACGCCCT GGTGCTGATG  121CCCACCGGCG GCGGCAAATC GCTGTGCTAC CAGGTCCCAG CCCTGCTGCG TGACGGTTGC  181GGCATCGTCA TCTCGCCGCT GATCGCACTG ATGCAGGACC AGGTCGAAGC CCTGCGCCAG  241CTCGGCGTGC GCGCCGAGTA CCTGAATTCA ACCCTGGACG CCGAGACCGC CGGCCGCGTC  301GAGCGCGAGC TGCTCGCCGG CGAACTGGAC ATGCTGTATG TCGCCCCTGA GCGGCTGCTG  361AGCGGGCGTT TCCTGTCGCT GCTGTCGCGC AGCCAGATCG CCCTGTTCGC CATCGACGAA  421GCACACTGCG TGTCGCAGTG GGGCCATGAC TTCCGCCCCG AATATCGCCA GTTGACCGTG  481CTGCACGAGC GTTGGCCGCA GATCCCGCGG ATCGCGCTGA CCGCCACCGC CGATCCGCCG  541ACCCAGCGCG AGATCGCCGA GCGCCTCGAT CTGCAGGAAG CGCGCCATTT TGTCAGTTCC  601TTCGACCGCC CCAACATCCG CTACACCGTC GTGCAGAAGG ACAACGCCCG CAAGCAGCTG  661ACCGACTTCC TGCGCGGCCA CCGTGGCGAG GCCGGCATCG TCTACTGCAT GTCGCGGCGC  721AAGGTCGAGG AGACCGCTGA ATTCCTCTGC GGCCAAGGCG TCAACGCTCT GCCTTACCAC  781GCCGGCCTGC CGCCGGAAGT GCGCGCCAGC AACCAGCGCC GCTTCCTGCG CGAGGACGGC  841ATCGTGATGT GTGCCACCAT CGCCTTCGGC ATGGGCATCG ACAAGCCGGA CGTGCGTTTC  901GTCGCGCATA CCGACCTGCC CAAGTCGATG GAGGGCTACT ACCAGGAAAC CGGACGCGCA  961GGCCGCGATG GCGAAGCCGC CGAGGCCTGG CTGTGCTACG GCCTGGGTGA TGTGGTACTG 1021CTCAAGCAGA TGATCGAGCA GTCCGAGGCG GGCGAAGAGC GCAAGCAGCT GGAACGGGCC 1081AAGCTCGACC 
ATCTGCTGGG CTACTGCGAA TCGATGCAGT GCCGCCGCCA GGTGCTGCTG 1141GCCGGCTTCG GCGAAACCTA CCCCCAACCG TGCGGCAACT GCGACAACTG CCTGACGCCA 1201CCGGCCTCGT GGGACGCGAC CATACCGGCA CAGAAGGCGC TGAGCTGCGT CTACCGCAGC 1261GGCCAGCGCT TCGGTGTCGG CCACCTGATC GACATCCTGC GCGGCAGCGA GAACGAGAAG 1321GTGAGGCAGC AGGGCCACGA CAAGCTGAGC ACTTATGCCA TCGGCCGCGA CCTGGATGCA 1381CGCACCTGGC GCAGCGTGTT CCGCCAGCTG GTCGCGGCCA GCCTGCTGGA AGTGGACAGC 1441GAGGGCCACG GCGGCCTGCG CCTGACCGAC GCCAGCCGCG ACGTGCTGAC CGGCCGCCGC 1501CAGATCAGCA TGCGCCGCGA CCCGGCCAGC AGCAGCAGCG GACGCGAGCG CAGTGCGCAG 1561CGCACCGGCC TGTCGGTACT GCCGCAGGAC CTGGCCCTGT TCAACGCGCT GCGCGGCCTG 1621CGCGCCGAAC TGGCCCGGGA ACAGAACGTA CCGGCGTTCG TGATCTTCCA CGACAGCACC 1681CTGCGCAACA TCGCCGAGCG GCGCCCGACC AGCCTGGATG AACTGGCCCG GGTCGGCGGC 1741ATCGGCGGTA CCAAGCTGAG CCGCTATGGC CCGCGCCTGG TCGAGATCGT GCGCGAAGAA 1801GGCCTGTTGC TCAACGGGCT GAACGCGGCC ATGGCCCGTG GTCACGAAGA AATGGGGCGG 1861ATGGCCCACG CCGCAGCCGC TGCTGTTGAT GGCGGCACTG CCGACTGCCA CCACCACGCC 1921GCCATGCAGG CCGACCCGGC CCCGCAGGCC AAGGCCCCGG CCCACGACGC CCACTGCCAG 1981ATCAAGGACT GCGTGCGCAG CTGCGCCCAG CACCCGCTGC TGGTGGTGCA GCCGTTGCCG 2041TTCATGGCCG GACCGGCACT GTCGCTGGCC CCGCAGCCGA TGCCGGCCAC CGGCCGGCCG 2101GCGCCCCCGT CTGCCGCCGA TCTCACGCCC TCCCATCGGC TGATTCCACA CGCACCGGCC 2161TGGCCGCCGG TGGCGTGGTT GCCGGCATCG CCGCTGTCGG CGTGCCGCAG CGCGTGCTCG 2221CCGCCGCCAC TGCCGCCCCA CGCCTGGCCG GCGCCCCCGC CGTGCTCAGC GACACCCGCA 2281TCGAACTGGC CATCGGCGAA TCGCTGGCCA ACTTTCACTG GCCGCACCCG TCCGGCGATC 2341ACCGTCAATG GATCGCTGCC GGCACCGATC CTGCGCTGGC GCGAAGGCCA GACCGTGGAC 2401CTGTTCGTGC GCAACACGCT GGACCGCCAC CCGACCTCGA TCCATTGGCA CGGCATTCTG 2461CTGCCGGCCA ACATGGACGG CGTGCCCGGC CTGAGCTTCA ATGGCATCGG CCCCGGTGAG 2521ACCTACCACT ACCACTTCGA ACTGAAGCAG TCGGGTACCT ACTGGTACCA CAGCCACTCG 2581ATGTTCCAGG AGCAGGCCGG CCTGTACGGA GCGCTGATCA TCGACCCGGC CGAGCCGGCG 2641CCCTACCAGC ACGACCGCGA GCACGTGATC CTGCTGTCCG ACTGGACCGA CATGGACCCC 2701GGCGCGCTGT TCCGGCGCAT GAAGAAGCTC GCCGAGCATG ACAACTACTA CAAGCGCACC 2761CTGCCCGACT TCCTGCGTGA CGTGAAGCGC GACGGTTGGT 
CGGCCGCGTT GTCCGACCGT 2821GGCATGTGGG GGCGGATGCG GATGACGCCC ACCGACATCT CCGACATCAA TGCGCACACC 2881TACACCTACC TGATGAATGG CACCGCGCCG GCCGGCAACT GGACCGGGCT GTTCCGCAGC 2941GGCGAGAAAG TACTGCTGCG CTTCATCAAC GGCGCCTCGA TGACCTACTT CGACGTGCGC 3001ATTCCCGGCC TGAAGATGAC CGTGGTCGCC GCCGACGGCC AGTACATCCA TCCGGTCAGC 3061ATCGACGAGT TCCGCATCGC GCCGGCCGAA ACCTACGACG TGCTGGTGGA ACCGACCGGG 3121CAGGACGCGT TCACCATCTT CTGCCAGGAC ATGGGCCGCA CCGGTTCCCG CGCGCGACCC 3181ACGCCCGTTG CTGACGATAG CGACATGGGG CACGACATGG GTAGTGGTGG CCATGGTGGC 3241CACGACATGG CCGCGATGAA GGGCATGGAA GGCGGCTGCG GCGCCAGCAT GGACCACGGT 3301GCGCACGGCG GTAGCGATGC CGCCAGCAAG GCACCGAAGC ACCCGGCCAG CGAACGCAAC 3361AACCCGCTGG TGGACATGCA GAGCTCGGCC ACCGAACCGA AGCTGGACGA TCCCGGCATC 3421GGCCTGCGCG ACAACGGTCG CCAGGTACTC ACCTACGGCG CGATGCGCAG CCTGTTCGAG 3481GACCCCGATG GCCGCGAGCC GAGCCGCGAG ATCGAGCTGC ACCTGACCGG CCATATGGAG 3541AAGTTCTCCT GGTCATTCGA TGGCATTCCG TTCGCCAGCG CCGAGCCGCT GCGGCTGAAC 3601TACGGCGAGC GCATGCCATC TGATCTGGAG AACGCGCAGG GCGAATTCCA GCTGCGCAAG 3661CACACCATCG ACATGCCACC CGGCACCCGC CGCAGTTACC GCGTGCGCGC CGATGCGCTC 3721GGTCGCTGGG CCTACCACTG CCATCTGCTC TACCACATGG AAGCGGGCAT GATGCGCGAA 3781AACAGCACCG GCCAGGCCTG GGAGGCCACC GGCTGGATCG GTGGCAACAT CAACCGCCTG 3841TGGTTGCGCA CCGATGGCGA ACGCAGCCGC GGCCGCACGG AATCGTCGTC ACTGGAAGCA 3901CTGTATGGTC GCAGCGTATC GCCGTGGTGG GACGTGCTGG GCGGCGTGCG CCAGGACTTC 3961CGGCCGGCCG ACTCGCGCAC CTGGGCGGCC ATCGGCATCC AGGGCCTTGC ACCGTACAAG 4021TTCGAGAGCT CGGCAACGCT GTACATGGGT TCCGGCGGCC AGGTGCTGGC CAAGGCCGAG 4081GTCGAGTACG ACGTGCTGCT GACCAACCGC CTGATCCTGC AGCCGCTGCT GGAAGCCACC 4141ATCGCAGCCA AGGATGAACC GGAGTACGGC ATTGGTCGCG GACTGAACAA GATCCGCCGC 4201GCCACCCTTG CCGATGTCGA CGCGCTGTCG ACCATCGCCA TCACCACCTA CAACGAAACC 4261TGGGGCGACT CGTATCCGGC GCAGGAGCTG CAGGATTTCC TGCAGGCGCA CTACAGCAGC 4321GAACCGCAGC GCGCCGAGTT GTCCGACCCG CGCAGTGCGA TCTGGCTGCT GTTGGACGGC 4381GACAACGTGG TCGGCTACCT GGCCGCCGGT GCCAACACCC TGCCGCATGC CGAAGCCCGC 4441GAGGGCGACA TCGAACTGAA GCGCTTCTAC ATCCTGGCCG ACTACCAGAA CGGCGGCCAC 4501GGCGCGCGCC 
TGATGGACGC GTTCATGGCC TGGCTGGACC AGCCGCAGCG CCGCACCCTG 4561TGGGTGGGCG TCTGGGAGGA GAACTTCGGC GCGCAGCGCT TCTACGCGCG CTACGGCTGC 4621AGCAAGGTCG GCGAGTACGA CTTCATCGTC GGGGATACGC GCGACCGCGA GTTCATCCTG 4681CGCCGGCTGT GA // SEQ ID NO: 104 Amino Acid Sequence of OsRecQCDEFINITION OsRecQC protein 1563 amino acids ORGANISMOryza sativa Cultivar NipponbareMASRPAHDLLQRVFGYDDFRGPQQDIVEHVAAGHDALVLMPTGGGKSLCYQVPALLRDGCGIVISPLIALMQDQVEALRQLGVRAEYLNSTLDAETAGRVERELLAGELDMLYVAPERLLSGRFLSLLSRSQIALFAIDEAHCVSQWGHDFRPEYRQLTVLHERWPQIPRIALTATADPPTQREIAERLDLQEARHFVSSFDRPNIRYTVVQKDNARKQLTDFLRGHRGEAGIVYCMSRRKVEETAEFLCGQGVNALPYHAGLPPEVRASNQRRFLREDGIVMCATIAFGMGIDKPDVRFVAHTDLPKSMEGYYQETGRAGRDGEAAEAWLCYGLGDVVLLKQMIEQSEAGEERKQLERAKLDHLLGYCESMQCRRQVLLAGFGETYPQPCGNCDNCLTPPASWDATIPAQKALSCVYRSGQRFGVGHLIDILRGSENEKVRQQGHDKLSTYAIGRDLDARTWRSVFRQLVAASLLEVDSEGHGGLRLTDASRDVLTGRRQISMRRDPASSSSGRERSAQRTGLSVLPQDLALFNALRGLRAELAREQNVPAFVIFHDSTLRNIAERRPTSLDELARVGGIGGTKLSRYGPRLVEIVREEGLLLNGLNAAMARGHEEMGRMAHAAAAAVDGGTADCHHHAAMQADPAPQAKAPAHDAHCQIKDCVRSCAQHPLLVVQPLPFMAGPALSLAPQPMPATGRPAPPSAADLTPSHRLIPHAPAWPPVAWLPASPLSACRSACSPPPLPPHAWPAPPPCSATPASNWPSANRWPTFTGRTRPAITVNGSLPAPILRWREGQTVDLFVRNTLDREPTSIHWHGILLPANMDGVPGLSFNGIGPGETYHYHFELKQSGTYWYHSHSMFQEQAGLYGALIIDPAEPAPYQHDREHVILLSDWTDMDPGALFRRMKKLAEHDNYYKRTLPDFLRDVKRDGWSAALSDRGMWGRMRMTPTDISDINAHTYTYLMNGTAPAGNWTGLFRSGEKVLLRFINGASMTYFDVRIPGLKMTVVAADGQYIHPVSIDEFRIAPAETYDVLVEPTGQDAFTIFCQDMGRTGSRARPTPVADDSDMGHDMGSGGHGGHDMAAMKGMEGGCGASMDHGAHGGSDAASKAPKHPASERNNPLVDMQSSATEPKLDDPGIGLRDNGRQVLTYGAMRSLFEDPDGREPSREIELHLTGHMEKFSWSFDGIPFASAEPLRLNYGERMPSDLENAQGEFQLRKHTIDMPPGTRRSYRVRADALGRWAYHCHLLYHMEAGMMRENSTGQAWEATGWIGGNINRLWLRTDGERSRGRTESSSLEALYGRSVSPWWDVLGGVRQDFRPADSRTWAAIGIQGLAPYKFESSATLYMGSGGQVLAKAEVEYDVLLTNRLILQPLLEATIAAKDEPEYGIGRGLNKIRRATLADVDALSTIAITTYNETWGDSYPAQELQDFLQAHYSSEPQRAELSDPRSAIWLLLDGDNVVGYLAAGANTLPHAEAREGDIELKRFYILADYQNGGHGARLMDAFMAWLDQPQRRTLWVGVWEENFGAQRFYARYGCSKVGEYDFIVGDTRDREFIL RRL //

In a similar manner, orthologs of the above rice OsRecQ genes were identified in maize (SEQ ID NO:105 and SEQ ID NO:106). As will be appreciated by those of skill in the art, other such orthologs can be identified in other monocot or dicot species using the rice OsRecQ amino acid sequence as a query. Standard molecular methods can be used to clone these sequences from other plants.

SEQ ID NO: 105 ZmRecQa cDNA from Zea mays LOCUS ZmRecQa 1185 bp ORGANISMZea mays REFERENCE 1 (bases 1 to 1185) BASE COUNT347 a 258 c 277 g 303 t ORIGIN    1GCACGAGCGC AAGGCAAGCT TTCCGCTTCC TATTTCGGAT TGGGATCATC AGCGGCTGTA   61GCGTGGACCC GACGGGGGTG TCCGGACCAC ATCCCTATTT CATCTTGGTA CCCCGTCCGT  121CTCCGATTTC AGAAGCACGG CGGGCTCCCC GGCAGCCTCT ACCGAGCAGA AAGCTGAGTT  181CTACCCCAGA ACCGAGGCAT GGAGGACGAA GAAAACATCG AGGGAGAACT GTTGCTCGTG  241GAGTCACAAC TCCACGACAT CCAAGGACAA ATTAAAACAT TACTCGATCG CCAAGAGGAG  301TTGTATGAAC GCCAGGCACA GTTGAAGGCT TTGCTCGAAG CATCTAAATT GACCAGAAAT  361ACAACAATTA ACACATCTTC AGTTGCTCCG GAAGATTGGT CTGGGAGCTT CCCATGGGAT  421CTGGAGGCTG ACGATACCAG GTTCAATATA TTTGGCATTT CCTCCTACCG ATCAAATCAA  481CGAGAAATAA TTAATGCAGT CATGAGTGGA AGAGATGTTC TGGTCATAAT GGCAGCTGGT  541GGAGGGAAGA GTCTATGTTA CCAGCTCCCA GCTGTACTTC GTGATGGAAT TGCACTGGTT  601GTCAGTCCTT TACTTTCCCT TATTCAGGAC CAGGTCATGG GACTGTCAGC TTTAGGTATA  661CCAGCATACA TGCTAACTTC AACTACCAAC AAGGAAGTTG AGAAGTTCAT CTATAAGACA  721CTTGATAAAG GAGAAGGAGA ACTAAAGATA TTATATGTGA CACCTGAAAA GATCTCAAAA  781AGTAAAAGGT TCATGTCTAA GCTCGAGAAA TGCCATCATG CCGGTCGTCT TTCTCTGATT  841GCAATAGATG AGGCTCACTG CTGTAGCCAA TGGGGTCATG ATTTTCGTCC TGACTACAAG  901AATCTTGGCA TTTTGAAAAT TCAATTTCCC AGTGTTCCAA TGATAGCTTT AACTGCAACT  961GCAACAAGTA AGGTCCAAAT GGATTTAATG GAGATGCTCC ACATCCCGAG ATGCATCAAG 1021TTTGTCAGCA CAGTTAACAG GCCCAACCTT TTTTATAAGG TGTCTGAGAA ATCGCCAGTT 1081GGAAAGGTTG TCATTGATGA GATCACAAAG TTTATAAGTG AATCATACCC AAATAATGAG 1141TCTGGAATTA TATACTGCTT TTCAAGGAAG GAATGTGAAC AGGTT // SEQ ID NO: 106ZmRecQb cDNA from Zea mays LOCUS ZmRecQb 870 bp ORGANISM Zea maysREFERENCE 1 (bases 1 to 870) BASE COUNT 239 a 200 c 242 g 189 t ORIGIN   1 CTTGAGGATC CCCAACGCTG TGGTACTGAA GAGGAGCTTC GACAGACTGA ACCTCAACTA  61 CGAGGTAATC GGCAAGACGA AAACTTTCCA GAAGCAGCTG GGCGATCTCC TGAAAGAGCG 121 CTTCATGAAC GAATCTGGTA TCGTGTACTG TCTCTCGAAG AACGAGTGTG CAGACACTGC 181 CAAGTTTTTG AGGAAGAAAT ACAAGATCAA ATGCGCGCAC TACCACGCTA GCCTGGCAGC 241 TCGTCAGCGA ACCAGTGTCC 
AGGAGAAATG GCACAACGGG GAGGTTAAGG TCATCTGCGC 301 TACCATAGCC TTCGGCATGG GGATCGACAA ACCTGACGTG CGTTTTGTTA TCCACAACAC 361 ATTGTCCAAG TCAATAGAAA GCTACTACCA GGAGTCCGGG AGGGCAGGGC GAGATGAGCT 421 TCCGGCACAC TGTATCGTCT TGTACCAGAA GAAAGACTTC AGCCGTATCG TGTGCATGTT 481 GAGGAACGGT GAGAACTTCA GGAGCGAGAG CTTCAGGGTT GCGATGGAGC AAGCTAAGAA 541 GATGCAGGCA TACTGCGAGC TCAAGACCGA GTGCCGGAGA CAGGCACTTC TGCAGCACTT 601 CGGCGAACAG TACGACAGGC GAAGGTGCCG AGACGGGCCT AGCCCCTGCG ACAACTGCCT 661 CAAGACATAG TTTAGGGTAA TAAACTATGG CGATAAAAAA TGCCATGACG CTTGGTTATG 721 CTCTGAACTT GTGAGGTGTG TGCCACTTCC ACAGTACATT CGTCTGTGTA TATGTAGCAT 781 CCATAGCTCA AACAAGTGGC CGCAACTGCA CTGTGTGTAA CGATGGTCTT TGTTTTCAGT 841 TGGATTGTGA GGTTCGGGGC TTTAAAAAAA //
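As a simple integrity check on the sequence listings above, the BASE COUNT line of each entry should sum to the stated sequence length. The following Python sketch uses the counts exactly as printed in the listings; the dictionary layout and function name are illustrative assumptions.

```python
# name: (stated length in bp, (a, c, g, t) counts from the BASE COUNT line)
LISTINGS = {
    "OsRecQA (SEQ ID NO:99)": (3525, (1090, 736, 805, 894)),
    "OsRecQB (SEQ ID NO:101)": (1419, (427, 306, 338, 348)),
    "OsRecQC (SEQ ID NO:103)": (4692, (817, 1669, 1511, 695)),
    "ZmRecQa (SEQ ID NO:105)": (1185, (347, 258, 277, 303)),
    "ZmRecQb (SEQ ID NO:106)": (870, (239, 200, 242, 189)),
}


def base_count_matches(length_bp, counts):
    """True when the four base counts add up to the stated length."""
    return sum(counts) == length_bp


for name, (length_bp, counts) in LISTINGS.items():
    status = "ok" if base_count_matches(length_bp, counts) else "MISMATCH"
    print(f"{name}: {sum(counts)} of {length_bp} bp -> {status}")
```

Each of the five listings passes this check, which helps confirm that the header values survived transcription even where the sequence layout itself is hard to read.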

2. Suppression or Down-Regulation of OsRecQ Gene Expression to Enhance the Efficiency of Targeted Integration through Homologous Recombination: Antisense Suppression, Sense Co-Suppression, dsRNAi, Gene Knockout, and the Use of Dominant Negative Mutants.

E. coli and yeast cells deficient in RecQ show an elevated level of homologous recombination activity (Nakayama et al. 1985 Mol. Gen. Genet. 200, 266-271; Watt et al. 1995 Genetics 144, 935-945). The above rice and maize RecQ sequence homologs can be used to down-regulate RecQ expression levels and thereby enhance targeting frequency in the previously described target maize and rice lines. Similarly, RecQ homologs from other plants can be used to enhance the frequency and efficiency in those plants of targeted integration through homologous recombination. (See Bagherieh-Najjar, de Vries, Hille, and Dijkwel, “Increased Homologous Recombination and Altered DNA Damage Response in the Arabidopsis recQ14A Mutant,” attached hereto and forming a part hereof).

Down-regulation can be achieved ectopically by a transgene using methods known in the art, including homology-dependent gene silencing (antisense suppression, sense suppression, dsRNAi, virus-mediated silencing) and dominant negative mutants of the gene. For homology-dependent silencing, only part of the gene is needed to initiate silencing of the gene. For example, a segment of sense and/or antisense OsRecQ mRNA sequence can be placed under the control of a constitutive or tissue-specific promoter to initiate gene silencing of native genomic OsRecQ genes. Dominant negative mutants are defective variants of a protein, usually deficient in one or more functions that the protein normally has. For example, RecQ has a helicase domain and also interacts with other proteins to carry out its normal biological functions. A dominant negative mutant RecQ may lose its helicase activity but still retain its interactions with other proteins. Sometimes a dominant negative mutant is a truncated protein.

A particular RecQ gene can also be knocked out totally, and plant lines with a RecQ gene knock-out can be used in gene targeting. In plants, mutagenesis methods such as transposon or T-DNA insertion, UV, gamma ray, or X-ray irradiation, and chemical mutagenesis can be used to inactivate these genes. The materials with reduced RecQ expression obtained by the above methods are then used as target tissue when the targeting methods disclosed herein are carried out. For example, rice transgenic target lines with a pNOV5025 or pAdF55 T-DNA insertion locus can be introgressed into lines with the OsRecQ dsRNAi knockout locus, and the resulting lines containing both loci can be re-transformed with the targeting donor vector pQD200C6 or pAdF77, with or without another recombination enhancing vector (e.g., pNOV5033, which expresses the I-CeuI endonuclease to make a dsDNA break at the target locus). Similarly, maize transgenic target lines with a pNOV5025 T-DNA insertion locus can be introgressed into lines with the ZmRecQ dsRNAi knockout locus, and the resulting lines containing both loci can be re-transformed with the targeting donor vector pQD200C6, with or without the recombination enhancing vector pNOV5033, which expresses the I-CeuI endonuclease to make a dsDNA break at the target locus. Down-regulation of the RecQ gene can also be carried out transiently by introducing the interfering protein, RNA, or RNA expression cassette during the targeting process, such as, for example, during the Agrobacterium-mediated delivery and transformation of the donor T-DNA into the host cell.

Example 35 Over-Expression or Up-Regulation of OsRad54, OsBRCA1, OsBRCA2, and OsSPO11 to Enhance the Efficiency of Targeted Integration Through Homologous Recombination

Some genes encode proteins that are involved in the recombination machinery of the cell or that are positive regulators of the recombination process. To clone some of these genes, the proprietary Syngenta rice genome (Myriad contigs V8, Nipponbare cultivar) and public rice genome sequence databases were searched with the TBLASTN program using the protein sequences of human BRCA1, BRCA2, RAD54, and yeast SPO11. Primers were designed to amplify predicted cDNAs encoding homologs of these sequences. The following cDNAs were cloned from young rice flowers or mitomycin C-treated callus tissue: OsRad54A (SEQ ID NO:107), OsRad54B (SEQ ID NO:108), OsBRCA1 (SEQ ID NO:109), OsBRCA2 (SEQ ID NO:110), OsSPO11A (SEQ ID NO:111), and OsSPO11B (SEQ ID NO:112). These cDNA sequences are useful for increasing targeting efficiency, since over-expression of these genes can increase the frequency of homologous recombination in plant cells.

Each of these genes can be put under the control of a regulated promoter, such as a tissue-specific or inducible promoter, so that its expression is tissue-specific or transient. In one embodiment, the recombination enhancing gene is expressed when the donor sequence is delivered to the target cell. In another embodiment, several of the above recombination enhancing genes are co-expressed in the host cell to increase the targeting efficiency. The proteins encoded by these genes (the predicted amino acid sequences of which are shown in SEQ ID NOs:113, 114, 115, 116, 117, and 118) can be introduced into the host cell by any means described herein (such as the methods described above with reference to a mega-endonuclease or a recombinase) or by methods that are otherwise known in the art. Such other methods include, for example, introducing the protein (or a fusion protein containing the protein) into the cell through physical or biological means, e.g., electroporation or Agrobacterium. For example, rice target lines with a pNOV5025 or pAdF55 T-DNA insertion locus are introgressed with a line having a transgenic locus for the over-expression of these genes, and the resulting lines containing both loci are re-transformed with a donor vector, such as pQD200C6 or pAdF77, with or without another recombination enhancing vector (e.g., pNOV5033, which expresses the I-CeuI endonuclease to make a dsDNA break at the target locus). Similarly, maize target lines with a pNOV5025 T-DNA insertion locus can be introgressed into a line with a transgenic locus for the over-expression of these genes, and the resulting lines containing both loci are re-transformed with a donor vector, such as pQD200C6, with or without the recombination enhancing vector pNOV5033, which expresses the I-CeuI endonuclease to make a dsDNA break at the target locus.
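The headers of the sequence records that follow can be cross-checked arithmetically. The sketch below is an illustration only: the helper names are hypothetical, and the numbers are copied from the record headers of SEQ ID NOs:107-109 and 111 (the records that state an explicit CDS range) and from the corresponding protein records. It assumes each CDS ends in a stop codon, so the protein length is the codon count minus one, and it checks that each BASE COUNT sums to the stated record length.

```python
# Illustrative consistency check; values are copied from the record headers,
# and the helper names are hypothetical.

def expected_protein_length(cds_start: int, cds_end: int) -> int:
    """Residues encoded by a CDS (1-based, inclusive) that ends in a stop codon."""
    span = cds_end - cds_start + 1
    if span % 3 != 0:
        raise ValueError(f"CDS span {span} is not a whole number of codons")
    return span // 3 - 1  # the final codon is the stop codon

# name: (record length, CDS start, CDS end, (a, c, g, t), stated protein length)
RECORDS = {
    "OsRad54A": (3569, 1, 3564, (1072, 771, 865, 861), 1187),
    "OsRad54B": (3453, 1, 3447, (1134, 655, 776, 888), 1148),
    "OsBRCA1":  (2964, 1, 2964, (957, 623, 694, 690), 987),
    "OsSPO11A": (1329, 1, 1329, (225, 460, 405, 239), 442),
}

def check_record(name: str) -> bool:
    """True when the base counts and the protein length agree with the header."""
    length, start, end, base_counts, protein_len = RECORDS[name]
    return (sum(base_counts) == length
            and expected_protein_length(start, end) == protein_len)

for record_name in RECORDS:
    print(record_name, check_record(record_name))
```

A check of this kind catches transcription errors in a record header but says nothing about the correctness of the sequence itself.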

SEQ ID NO: 107: OsRad54A cDNA from Oryza sativ, (cultivar Kaybonnet)LOCUS OsRad54A cDNA 3569 bp ORGANISM Rice, Oryza sativa cv KaybonnetSOURCE Young flower REFERENCE 1 (bases 1 to 3569) AUTHORS Qiudeng QueCDS 1 . . . 3564 BASE COUNT 1072 a 771 c 865 g 861 t ORIGIN    1ATGGAGGACG ATGACGATGA CCAACGCTTG CTTCACAGCC TTGGTGTCAC GTCCGCAGAC   61ATCCACGATA TTGAAAGGAG AATCATATCA CAGGCAACAA CTGATCCTGC CGACTCATCT  121GGACCAACCA TCAATGGAGG TCATCAGCCT GATGATGCTC TCGCCAAACT GCATCACAAA  181CTGCGCTCTG TGCAAATTGA AATTGATGCT GTAGCCTCCA CCATCAAAGG AGCTAAGCTT  241AAGCAACCAT CCGGAAATAA ACCACATGAG CATAAAGGCA AGGACCAGCC AGATCATCAT  301GGAGCAGGAC ACCTCCAGCA AGCCCTTGCT GCCGACCGTC TTACAAGCCT CAGGAAAGCT  361AAAGCACAGA TACAGAAAGA GATACTACAG TCACATCTTT CTCCATCTGC CTCCAATCGA  421AAAGATAAAA TGCTGGCCAT GCTGGTCCAA GACGAGCCGA GGCACAAAAA GCCACCCGTA  481GGGCCTAAAA ACATCGTGAA ACGCCCGATG AAAACTGTCA CCTATGATGA TGACAACAAC  541TTCGATGCAG TGCTTGATGG AGCCTCTGCG GGATTTATGG AAACTGAAAG GGAAGAACTG  601ATCAGGAAGG GTTTGTTGAC ACCATTCCAT AAGTTGAAGG GCTTCGAGAA ACGTGTGGAA  661CTACCCGAAC CTTCTCATAG ACAAGATGAT TCTGCAGGAC AAACTGAAGA AGCCATGGAA  721GCTTCCAGGA TTGCTAGAGT TGCTCAGTCG CTAAAGCAGA TTGCACAGAA CCGCCCAGCA  781ACCAAATTGC TTGATTCAGA GTCTTTACCT AAGCTAGATG CACCTGCTGC CCCATTTCAG  841AGACTTGGAA AACCCCTAAA GCGTCCTGTC TCTCCCAGTT CAGATGAGCA GGAAAAGAAG  901AGACCAAGAA ATAAGACCAA AAGACCACTG CCTGGCAAGA AATGGAGGAA AGCAAACTCA  961ATTAAGGAAT CATCATTGGA TGACAACGAT GTTGGAGAGG CAGCTGTGTC AGTTTCAGAT 1021GATGATGAAG ATCAGGTTAC AGAAGGCTCT GATGAGTTAA CTGATGTTAC CCTTGAAGGA 1081GGTTTGAGAA TTCCTGGCAC ACTTTACACG CAACTATTTG ACTACCAGAA AGTGGGAGTG 1141CAGTGGCTAT GGGAGTTGCA TTGTCAAAGG GCTGGTGGAA TAATTGGAGA TGAAATGGGC 1201CTGGGAAAGA CTGTGCAGGT CTTGTCATTT CTTGGTTCCT TGCATAACAG TGGGCTCTAC 1261AAGCCTAGCA TTGTTGTTTG TCCTGTAACC CTTTTGCAAC AGTGGCGAAG GGAGGCCAGT 1321AGATGGTATC CAAAGTTCAA GGTTGAGATC TTACATGACT CTGCAAACAG TTCATCTAAA 1381AAGAGCAAGA GGTCTAGTGA TTCTGACAGT GAAGCTTCCT GGGATAGTGA TCAGGAAGAA 1441GCGGTTACAT GTTCAAAACC CGCAAAGAAG TGGGATGACT 
TGATTTCACG TGTTGTGAGT 1501TCAGGATCAG GTTTGCTTCT GACCACATAT GAGCAGTTAA GGATCCTAGG GGAGAAGTTG 1561CTTGATATAG AATGGGGATA TGCTGTATTG GATGAGGGTC ACCGCATTAG GAATCCTAAT 1621GCTGAGATTA CTCTTGTGTG CAAGCAATTG CAGACCGTGC ACAGGATAAT TATGACAGGT 1681GCACCTATTC AAAACAAACT TTCGGAGCTT TGGTCTCTCT TTGATTTTGT GTTCCCTGGA 1741AAACTAGGTG TCCTGCCTGT GTTTGAGGCT GAGTTTTCTG TTCCAATTAC TGTTGGTGGG 1801TACGCTAATG CAACACCATT GCAAGTGTCC ACGGCGTATC GATGTGCTGT TGTCCTACGT 1861GACCTGGTCA TGCCGTACCT TCTTAGAAGA ATGAAAGCTG ATGTCAATGC ACAGCTTCCC 1921AAGAAAACAG AGCATGTTCT TTTCTGTAGT CTAACTACTG AGCAACGTGC TACTTATCGT 1981GCATTTCTTG CTAGTTCGGA GGTGGAACAA ATCTTTGATG GTAACAGAAA TTCCCTTTAT 2041GGGATAGATG TTCTAAGGAA GATATGCAAT CATCCTGATC TACTTGAGAG AGAACATGCT 2101GCTCAGAATC CTGACTATGG GAATCCAGAA AGAAGTGGAA AGATGAAAGT GGTTGAGCAA 2161GTTCTTAAAG TATGGAAAGA ACAAGGTCAT CGTGTTCTTC TTTTCACTCA GACACAACAA 2221ATGCTTGACA TTATGGGGAA CTTCTTGACA GCTTGCGAAT ACCAATACCG AAGAATGGAT 2281GGACTTACAC CTGCAAAGCA AAGAATGGCA CTTATTGATG AATTCAATAA CACAGATGAA 2341ATTTTTATTT TCATTCTGAC CACGAAAGTT GGTGGACTGG GTACGAATTT GACTGGTGCA 2401AACCGGATTA TTATATATGA TCCTGACTGG AATCCTTCAA CTGACATGCA GGCTAGGGAA 2461CGTGCATGGC GAATTGGGCA AACTAGAGAT GTGACAGTTT ATAGACTGAT CACGCGTGGG 2521ACAATAGAGG AGAAAGTCTA CCATCGTCAG GTATACAAGC ATTTCCTCAC AAACAAAGTA 2581CTGAAAGACC CTCAGCAGAG GCGGTTTTTT AAAGCCAGAG ACATGAAGGA TTTGTTTACG 2641CTGCAAGATG ATGACAATAA TGGCTCAACT GAAACATCAA ATATTTTCAG CCAATTGTCT 2701GAGGATGTGA ATATCGGAGT TCCGAGTGAC AAGCAACAAG ACCAGCTATA TGCAGCCTCT 2761GCTACACCGA CAACCTCTGG GACTGAACCG AGCTCATCCA GGCATGGACA GGGTAAAGAA 2821GACCATTGCC CTGACCAAGC AGATGAAGAA TGCAACATTT TGAAGAGCCT TTTTGATGCT 2881CAAGGCATTC ATAGTGCGAT CAATCATGAT GCCATAATGA ACGCTAATGA TGACCAGAAG 2941CTGCGCCTAG AAGCAGAAGC TACACAGGTG GCACAAAGGG CAGCTGAAGC TTTACGCCAA 3001TCACGGATGC TCAGAAGTCA TGAAAGTTTT TCTGTTCCTA CATGGACTGG AAGAGCTGGT 3061GCTGCGGGGG CACCATCCTC TGTCCGCAGG AAGTTTGGGT CAACACTCAA TACCCAGTTG 3121GTTAATTCTT CTCAGCCATC AGAAACTTCA AATGGCAGGG GCCAAAGTCT TCAGGTGGGT 3181GCTCTAAATG 
GCAAAGCACT GTCCTCCGCT GAGCTTCTGG CCAGGATACG TGGAACCCGA 3241GAGGGAGCAG CTTCAGATGC ACTAGAACAT CAACTCAACC TGGGATCAGC TTCCAATCAC 3301ACATCGAGTT CATCAGGGAA TGGCCGTGCA TCAAGCTCTT CTACTAGGAG CATGATCGTA 3361CAGCCTGAAG TCCTAATCCG CCAATTGTGC ACCTTCATAC AGCAGCATGG TGGTTCCGCC 3421AGCTCAACAA GTATAACTGA ACACTTCAAG AACCGGATAC TGTCCAAGGA TATGCTGCTG 3481TTTAAGAATC TGCTGAAGGA AATAGCTACG TTGCAAAGAG GTGCAAATGG TGCAACGTGG 3541GTGCTGAAAC CTGACTACCA GTAACTAGT // SEQ ID NO: 108:OsRad54B cDNA from Oryza sativa (cultivar Kaybonnet) DEFINITIONOsRad54B cDNA 3453 ORGANISM Oryza sativa cv Kaybonnet SOURCEYoung flower REFERENCE 1 (bases 1 to 3453) AUTHORS Qiudeng Que CDS1 . . . 3447 BASE COUNT 1134 a 655 c 776 g 888 t ORIGIN    1ATGCGCACAA GCACCACATC AGATAGCCCA TCCCCATCTC CACAAAACAA AGCCTCTTTT   61AACACATCAC GTGGTGCTGC ATTTAGGGAT GAAGAACCAG GTGCAAAAGA CAATGAAGTT  121GAGAAAAGGA AACCATTGAT ATTACATTTG AAGAAGCGTT CAACCAAGGA ACTATCTACA  181GATACCACAT CATCAAAGTC AGGGTTACTT GGAAAGTCTT CAGAAGAGAA ACAGGAGAAA  241CACGGAAGTG CTTTGAAAGT GAAGAAACAT CTGCATCCCA TGGAATTATC TCCAAAGAAA  301TATAAGAACA AGAAGCAACA CAATCACAGA GACAGTAAGA GATCCGAAGC AAAAAAGGTC  361CAATATTTGG CATCAGATGT GGACAGTGAT TCTTCAATGG AACCATCTAC TTCTCTTGAG  421CACAGCGAAT CGCCGCCCCC AAAAAGAAAA TCGTTGGATG GAAGAACACC TGCATCAAGT  481ACCAAGAAAG GAAAAAAGAA AGTGAAATTT ATTGATAAAA AGCACCCTGA GAATGCTGTT  541CATATAACTG AAAAGGAGCA TGGTGGTGCA GGAGACAAAA TAACAACTCA GGGGGATCTG  601CAGGTTGATC GCATCCTAGG CTGTCGACTT CAGACAAGCC AAATCATTTC ACCTGCCCAT  661GCTTCATCAG AGCAGATTGA TATGGCCCCT CCTAGTGCAT CCGGTGCAAC AGAACCTAGT  721CAAGCCCTTT CAAAAGGACT TCATGAAGAA ATTCAGTCTT CTAATAGTGA TACTAATGTG  781ACAGAGGATG CATGTGCTGA TGAATTAGCA AACGATGGTG GGGAAAATAA TTTGGATTGT  841TCTGATGCTC AAAAGGAGAG TAATGTTAGA TCCCATGGAC ACAAGGAATC ACTTAACGCA  901AAAGAAATCA TGAATACAGC ATCAGCATGT TCCGCTGATC AAATTGTCAC AGTTAAGGAT  961GCTGGAGCAG TACAGACATA TGTAACGGCT TCAGTAAATG GTGAATATGA GACAGTAACT 1021GATATTCCAG AAGAAAAGAA TGACACCAAA CATCCAGTTT CCAAAGCTGA CACAGAAGTC 1081CACACTAAAC AAGAACATAC 
ACCTGATAGT AAATTGCATG GGAAACTAGA AAACTACAAA 1141GCAAAGTACG GAACAGGTTT GATAAACATC TGCAAAGAAC AATGGTGCCA ACCGCAACGA 1201GTTATTGCTC TGCGCACTTC TTTAGATGAA ATAGAAGAGG CTTTGATCAA ATGGTGTGCC 1261CTTCCATATG ACGAATGCAC GTGGGAAAGA TTAGATGAAC CTACAATGGT GAAGTATGCA 1321CATTTGGTCA CTCAGTTCAA AAAATTTGAA TCCCAGGCTT TGGATAAGGA TAAGGGAGGT 1381AGCCATGCAA AGCCAAGGGA ACACCAAGAG TTTAATATGC TGGTTGAGCA GCCAAAAGAA 1441CTCCAGGGAG GCATGCTCTT CCCTCATCAA CTGGAAGCAT TGAACTGGCT ACGCAAATGC 1501TGGTACAAGT CAAAAAATGT TATCCTTGCT GATGAGATGG GTCTTGGAAA GACTGTGTCT 1561GCCTGTGCTT TTCTATCATC CCTATGTTGT GAATATAAGA TTAACTTGCC ATGTCTTGTC 1621TTGGTTCCTC TTTCTACTAT GCCCAACTGG ATGGCTGAAT TTGCATCATG GGCACCTCAT 1681TTAAATGTTG TGGAGTATCA TGGTTCTGCA CGGGCAAGAT CTATTATTCG TCAATATGAG 1741TGGCATGAGG GTGATGCAAG CCAGATGGGT AAAATCAAGA AATCTCATAA GTTCAATGTA 1801TTGCTCACTA CTTATGAAAT GGTGCTTGTT GATGCTGCAT ATCTTCGGTC TGTGTCATGG 1861GAGGTTCTTA TAGTCGATGA GGGTCATCGT CTGAAGAATT CTAGCAGCAA ACTTTTCAGT 1921TTACTCAATA CATTATCATT TCAGCATAGA GTTTTGCTGA CTGGAACTCC GTTACAGAAT 1981AACATTGGTG AAATGTATAA CTTATTGAAC TTCTTACAAC CTGCTTCTTT CCCTTCTCTA 2041GCTTCATTTG AGGAGAAATT CAATGACCTT ACAACAACAG AGAAAGTGGA GGAGCTGAAG 2101AACCTTGTAG CTCCACATAT GCTTCGAAGA CTGAAAAAGG ATGCAATGCA AAATATCCCT 2161CCAAAGACTG AACGAATGGT GCCTGTTGAA TTGACATCAA TCCAGGCTGA ATACTACCGT 2221GCTATGCTTA CAAAGAACTA CCAAGTATTG CGCAATATTG GGAAAGGTGG TGCTCACCAG 2281TCATTGTTGA ACATAGTAAT GCAACTTCGG AAAGTCTGCA ATCATCCGTA TCTTATTCCT 2341GGAACTGAAC CTGAATCAGG ATCACCAGAG TTCTTGCATG AAATGCGAAT AAAGGCCTCA 2401GCAAAGTTAA CTTTGTTGCA CTCTATGCTT AAAATCCTAC ACAAGGATGG TCATCGAGTT 2461CTTATTTTTT CTCAGATGAC AAAGCTTCTT GACATCCTTG AAGATTACCT GACCTGGGAG 2521TTTGGTCCGA AAACATTTGA AAGAGTGGAT GGTTCAGTAT CTGTGGCAGA ACGCCAGGCA 2581GCAATTGCTC GTTTTAATCA GGACAAGAGT CGTTTTGTAT TCCTGCTATC TACGCGGTCA 2641TGTGGGCTTG GAATTAATTT GGCAACTGCA GATACTGTTA TCATATATGA TTCTGATTTC 2701AATCCACATG CTGATATACA GGCAATGAAC AGAGCACACA GAATTGGACA GTCAAACAGA 2761CTTTTAGTTT ACAGGCTTGT CGTGCGTGCT AGTGTTGAGG AGCGTATCTT 
GCACCTTGCG 2821AAGAAAAAAT TGATGCTTGA TCAACTTTTT GTTAACAAAT CAGAATCACA GAAGGAAGTG 2881GAAGATATCA TTCGCTGGGG AACAGAGGAA CTCTTCAGGA ATAGCGATGT TGCAGTTAAA 2941GATAATAATG AAGCTTCTGG TGCTAAAAAT GATGTAGCAG AGGTTGAGTT TAAGCATAAA 3001AGAAAAACTG GTGGACTAGG CGATGTTTAT GAAGACAGAT GTGCTGATGG TTCTGCTAAA 3061TTTAATTGGG ATGAAAATGC TATCACAAAG CTTCTTGACA GATCCAACGT TCCATCAACA 3121GTAGCTGAAA GCACTGATGG GGACTTGGAC AATGATATGC TTGGCACTGT AAAGTCAATA 3181GATTGGAACG ATGAGCTGAA TGATGACCCT GGTGCCACCG AGGACATCCC AAATATTGAT 3241AATGATGGTT GCGAGCAGGC ATCTGAAGCA AAGCAGGATG CAGCTAATCG TGTTGAAGAA 3301AATGAATGGG ATAAACTCTT ACGTGTCAGA TGGGAGCAGT ATCAAACTGA GGAGGAAGCA 3361TCTCTTGGTC GAGGTAAGCG TTTAAGGAAG GCTGTTTCTT ACAGGGAAAC ATTTGCAACC 3421ATTCCTAATG AAGCTTTAAG CGAGTAGAAC TAG // SEQ ID NO: 109:OsBRCA1 cDNA from Oryza sativa (cultivar Kaybonnet) DEFINITIONOsBRCA1 cDNA 2964 bp ORGANISM Oryza sativa cv Kaybonnet SOURCEYoung flower AUTHORS Qiudeng Que CDS 1 . . . 2964 BASE COUNT957 a 623 c 694 g 690 t ORIGIN    1ATGGCGGACA CGGGGAGCCT GGAGAAGATG GGGCGAGAGC TCAAGTGCCC CATCTGCCTG   61AGCCTTCTCA GTTCGGCGGT ATCCATCTCC TGCAACCACG TCTTCTGCAA TGATTGCCTC  121ACGGAATCGA TGAAATCCAC GTCGAGCTGC CCCGTGTGCA AGGTCCCGTT CCGACGACGA  181GAAATGCGAC CAGCACCTCA CATGGACAAT CTGGTCAGCA TTTTCAAAAG CATGGAGGCT  241GCAGCAGGTA CCAATGTTGT CTCAACACAG GAGGCTCCTG TGGTAAAACT TGCAGATGGA  301TCAGATTGTG TCAACAGCGG GAAAAATTCC AAAAGGTCAC AAAAATCATT GACACGAAAA  361AGGAAGGTAA CATCCGAGAT GGAAAAAAAT ACAGCAAAGG ATGCTACAGC TTCTGCATCC  421CAACCTACTA CAAAGCCTTC CTTCTCTACT AACAAAAGAA TACAAGTGAA ACCATTCCCT  481GAATCTGAGA CACCAATAAG AGCTGAGAAG ATTATGAAGC CTGAAGAGCC AAAAAATAAT  541CTGAATAATG ATGTTGAAGG AAAGAATAAA GCAGTGGCAT CGGGTCAACC TGGAAGTCCT  601TCATTGTCAC CCTTTTTTTG GCTAAGGGAA CAAGAAGAAC AAGAAGGCTG TACCGCTGAG  661ACGTTAAGTG AAACGCAATC TTTAGACACA CCCTTGCGTC ATAATGCACC CTCTTTTAGC  721GATATTAAAG ATTCTGATGA CGAAATCCCT TTAAATACAA CTCCAAATAG CAAAGCTGCG  781GCTACAGAAC TCTTTGACAG TGAAATATTT GAATGGACCC AGAGACCATG CTCTCCTGAA  841TTGTATTCCA CTCCATTGAA 
AAAGCAGAGT AAAGCTAAGA GTAAACTAGA TCAAATTGAA  901GAGAAGGGTG ATGAAGAAGA TGTGCATATT GGTGGTTCAT TTGATAAGCT GGGCAGTGCA  961AGTAATGCAG CTCAGCTTGT CAATACAAAA GCAACAAAGC AGAAGAGAAA GAAAACAAGT 1021CCCAGTAACA AAAACAGTGC AAAATTGTCC AATCGTGCTG AGCCCTGCAT AAAAAAGTCT 1081GATGCCAATC AACAAGGTTC AAATAGACGT AAAAGTGCTG CCCTAAAATC TTGTCAGAAA 1141AGCAGCAGTG CTGTAGGGAG GAATACTTCA GGTAGAAGAA ACAAGGCCTC TAGCAACAGC 1201AAGCCAATTC ATGGCTCTAG TGATAACTCC CCAGAGTCAT ATCTTCCTAA GGAGGGTTTG 1261GATGTTGAAG CACCTGACAA ACCCCTTTCT GAAAGGATCC AAAACTTGGA GAAAACTAGT 1321CGACGAAAGG GAAGTGCAAG GAAGCTGGAA ATGGCAGGGA AAACTATTTC AGATACTACA 1381GAGAAGAATA GTGAGCCAAG AAGTAAGAGA GTCAGAAGAA TGTCTGACCA CGCTATAGCT 1441AAACCGGTTG AAGTTCCTTC AGGATCTGGA AATGAAACAG AAATACCACA GCTTCACACC 1501CTCACAAAAG GCAGCATTCA ACGCAAATCC TCCAACGCTA GAAGACATAG CAAAGTTTGT 1561GGAGAACAGG AAGGTAAGAA TAAACTTGAG AACACGACAA TGACACCTAT TATTTTACAT 1621GGGAAATGCC AAAATAAAGA GGCAGTATGT ACAGCTCCTT CAGTAAGGAC TGCATCTGTT 1681AAGTACAAGC AAGCAAAATT TAGCGAACAA CCAGATTGTT TTGGAACGGA GAACTTTGGA 1741AACCTTCAAG CATGCCCTGC ACGTAATGTT TTACTGAAGA AGTGTGAGGT ATCTACTTTG 1801AAGGTTTCCT GTGCTTTCTG CCAGACCGAT GTCATCACAG AGGAGTCTGG AGAGATGGTT 1861CATTATCAAA ATGGGAAGCA AGTCCCTGCA GAGTTCAATG GAGGAGCCAA TGTGGTGCAC 1921TCTCACAAGA ACTGCCTTGA GTGGGCTCCT GATGTCTACT TCGAAGATGA TTCTGCCTTT 1981AATCTTACAA CTGAATTGGC GAGAAGCAGA CGGATCAAAT GTGCTTGCTG TGGAATTAAA 2041GGAGCTGCAC TTGGATGCTT TGAGATGAGT TGTCGGAGAA GTTTCCACTT CACCTGTGCT 2101AAACTAATCC CAGAATGCAG ATGGGATAAT GAAAATTTTG TGATGTTATG CCCTCTACAT 2161CGGTCTACAA AGTTACCCAA TGAAAATTCT GAACAGCAAA AGCAACCTAA AAGGAAAACA 2221ACACTCAAAG GGTCATCTCA AATAGGATCC AATCAAGATT GTGGTAATAA CTGGAAATGG 2281CCATCTGGAT CACCACAGAA GTGGGTTCTC TGCTGCTCAT CACTTTCTAG TTCTGAGAAG 2341GGACTTGTAT CAGAATTTGC AAAGTTAGCT GGCGTGCCTA TTTCGGCAAC TTGGAGTCCA 2401AATGTTACCC ATGTTATTGC ATCAACTGAT CTCTCTGGTG CTTGCAAACG GACGCTGAAG 2461TTTCTCATGG CAATCTTGAA TGGCAGATGG ATTGTCTCCA TAGATTGGGT TAAAACTTGC 2521ATGGAGTGCA TGGAACCAAT TGATGAGCAC AAATTTGAAG TCGCTACTGA 
TGTTCATGGG 2581ATCACTGATG GTCCTAGGTT AGGAAGATGC AGGGTTATTG ACAGGCAACC TAAGCTGTTC 2641GACAGCATGA GGTTCTACCT CCATGGGGAC TACACAAAAT CCTACAGAGG CTACCTGCAA 2701GATCTCGTGG TTGCAGCAGG TGGAATAGTT CTTCAGAGGA AGCCCGTATC AAGAGACCAG 2761CAAAAGCTTC TTGATGACAG CTCTGACCTC CTCATCGTTT ACAGCTTCGA GAATCAAGAT 2821AGGGCAAAAT CCAAGGCCGA AACCAAGGCT GCTGATCGCA GGCAGGCTGA TGCTCAGGCT 2881CTTGCTTGCG CTTCTGGAGG CAGAGTTGTG AGCAGTGCAT GGGTGATTGA CTCAATTGCA 2941GCCTGCAATC TGCAACCTCT TTGA // SEQ ID NO: 110:OsBRCA2 cDNA from Oryza sativa (cultivar Kaybonnet) DEFINITIONOsBRCA2 cDNA 4500 bp ORGANISM Oryza sativa cv Kaybonnet SOURCEOryza sativa cv. Kaybonnet Mitomycin-C treated calli REFERENCE1 (bases 1 to 4500) BASE COUNT 1379 a 856 c 1102 g 1163 t ORIGIN    1ATGGCTGACC TCTTCAACCA AGCTTTGGAT AAGCTGGTTG CTGCTGATGG AATGGCCGAA   61GCGATCGAGG ATTCAGGGAA AGGTGCGGTG TTCTGCACTG GGTTGGGGGG ATCAGTTGCC  121GTCAGCGAGA GGGCTGTAGA GAGGGCCAAG GCATTGGTTG GGGAGGTCGC GGAGGAGATA  181AGTAATGAGA GGAGGCAACC ATTTGGTGAT GGTTCTAATT TGGAGTGCGG ATTGGGAGAA  241AGTAATGTTT CATTTAAAGG TGGTGTACAT AAAGATAGTT TGTCTCCGAT GTTCCAAACC  301GGATCGGGTA AAATGGTTTC GCTGAGCAAG GGCTCAATTC AGAAGGCTAG AGCTGTTTTA  361GAAGGAAATG CCGAGAATTC TTCTGTCATT GCTGTACAGT CTATGTTCCA TACTGGATTG  421GTTAGGCCAG ACCCAGTCAG CAGGAGCTCC ACTGATAATG CAATGACTGT TTTGGAGGGA  481CAAACAAATC CAAAACAAGG AGATGTGGCA GATGTGTATG ACAAGGAAAA TTTTCCATTG  541TTCCAAACTG GTTCAGGTAA AGCTGTATCG GTCAGTGTAG CATCTATCCA GAAAGCTAAG  601GCTGTCCTGG AGCAAAATAA TACAGAAAAC ACGGAAGATT TTGGTAGGCC TGACCAATCT  661CTGATTTTCC AAACTGGTTC GCGAAGACCA GTCTTGATCA GTGAAAGATC TAGCTCTGTG  721GTGAAGGATG GAGGTGCTGA AAATATTGTG TTCCAAACGG GGTTAGGGAG GCCTGTTGTG  781GTGAGCCAGA CCTCAATTCA AAAGGCAAGG ACAGTATTAG ATCAAGAATG TGCCAAAAGA  841AGTGGACATG GAGATACTAA TGTCTCCACC ACTACTTTTC AAACTGAAAC ACCAACGCCT  901GTTCTGATGA GTGGTGGCCT GACTATGAAT GATAGATCTG TTACACCTGA GGGGGGTGTT  961TCAATGCAAG GAAATTTTTT GGAGGCTGAT GGTCACTTGC CATTATTTCA AACTGGGTTA 1021GGGAGGTCCA TTTCAGTAAG TAAAGGCTCA ATTAAGAGAG CAAGTGCACT TCTGGAGCCA 
1081AGGAACATTA CAAAAGAACT GGAAGATGAA GCTCACTCAG ATGATGGCTG TGCCACTCCA 1141ATGTTCAAAA CTGGATCAGG AAGGTCTATC ACAGCAAGTG AAAATTCTAG AAAGAAAGCC 1201CACGTTGTCT TAGAGGGCGA GGAACCAGTA AAAAATGTAA ATAATGACAC TGGAGAAGCC 1261ATTGCTCCAA TGCTCCATGC TGGAATGCAG AAGTTTGCAC CCCAAAATAG AAACTCAAGT 1321CATAAGGCGA TCACCCTCAT GGAGCAAGGG AGCTCTATGG AAGAAGACCG TGGAAACGAA 1381CCACCAATGT TTCGAACTGG ATCTGGGAAG TCAGTCTTGA TTAGTCACAG CTCCGTGCAG 1441AAGGCAAGGG CGGTTCTGGA GGAAGAAGGC AATATGAAGA AAGAAAATCA CAAACAACTT 1501AGCAATGTGG ACAAATATAT TCCGATCTTT ACTTCACCTC TCAAGACAAG CTATGCAAGG 1561ACTGTACATA TATCTTCAGT TGGTGTTTCT CGAGCTGCAA CTTTGTTGGG TTTGGAGGAG 1621AATACCCTTT CAACACAACT TTTAGGACAT GTGGGTGATA AGCTAGGTAC AAAGATAACT 1681GTTGAGAGGG AAAATTCAGA GCACCAGTTT GGTGTAGCAT CAGTCAGTGG AATTTCTGGT 1741GGCTGCCCTA TAAGCTCTGG CCCAGCTGAA AACCAAGTAC TTATGGATCC ACATCAGCAT 1801TTTGCATTTT CTAAAACAAC GTTCTCTGAT TCCAGTGAGC AAGCTATCAG GTTCAGCACT 1861GCTGGCGGCA GAACAATGGC TATTCCTAGT GATGCACTTC AGCGTGCGAA AAATCTTCTG 1921GGTGAATCGG ATTTAGAGGT TTCACCAAAT AATTTATTAG GCCACTCTTC AGCATCTGCT 1981TGTAAAGAGA ATATACAAAA TTCAACTGGT CTGCGAAAAG AAGGTGAACC TGATTTATTG 2041AAAAGTAGGG GGAACAGCAA AACTGAGCCA GCACAATTTT CCATTCCAGC AAAACCTGAT 2101AGGAAGCACA CAGATTCCTT GGAATATGCT GTACCTGATG CCACTTTGGC TAACGGAAAC 2161TCCGTCAGGC TTCATGCGGC AAGAGATTTT CATCCTATCA ATGAAATTCC AAAGATATCC 2221AAGCCTTCTT CCAGATGTTC ATTTGGAACT GAAAATGCAA GTGACACTAA AGATAAGGCT 2281CGAAGACTCC AAATGCCATC TGGACCATTG ATTGACATCA CTAATTACAT CGATACACAT 2341TCTGTTAATA CTGACTACCT GGCCGGTGAG AAGAGAAGAT TTGGGGGAAG AAACTCCATA 2401TCTCCCTTTA AACGTCCTCG TTCTTCCAGG TTCATCGCAC CTATCAACAT CAATAATCCA 2461TCCCCTTCTG GAGTATCCAA ACTACCTATT CAGATTAATC CCTGTCGAAC AAAGCTATCT 2521TCATGCTATC CTTTTCAACA TCAAAGAAAA TCGTGTGAAG AGTATTTTGG TGGTCCCCCA 2581TGCTTCAAAT ATTTGACAGA AGATGTAACA GATGAAGTGA AGCTCATGGA TGCAAAAAAG 2641GCTGAGAAGT ACAAGTTTAA AACAGATACT GGTGCAGAAG AATTTCAGAA GATGCTTCTT 2701GCCTGTGGTG CTTCATTGAC ATACACAACT AAAGAATGGG TCAGCAACCA CTACAAATGG 2761ATTGTTTGGA AGCTTGCTTC ATTGGAGAGA 
TGCTATCCAA CTAGAGCTGC TGGCAAATTC 2821TTAAAAGTTG GTAATGTTTT GGAAGAGCTG AAGTACAGGT ATGACAGAGA AGTGAACAAT 2881GGCCACCGCT CAGCCATAAA GAAAATTTTG GAAGGGAATG CTTCACCATC TTTGATGATG 2941GTGCTGTGCA TTTCTGCTAT TTACTCTTGT CCTGACCTAA ACAACAGTAA GCCAGAGGAT 3001GATAGGGCAC ATACAGACGA CGACAACAGT GAGAATAAAA GCTTGAGACC TGCTAAAAGG 3061AACATGTCTA CAAAGATTGA ACTAACTGAT GGATGGTATT CTCTAGATGC GTCATTAGAT 3121CTGGCACTTT TGGAGCAACT AGAGAAAAGA AAACTTTTTA TAGGACAGAA GCTTCGGATA 3181TGGGGAGCTT CACTATGTGG GTGGGCTGGG CCTGTGTCAT TTCATGAGGC ATCGGGTACC 3241GTCAAATTAA TGATCCACAT AAATGGCACC TATCGTGCAA GATGGGATGA GACTTTGGGG 3301TTATGCAAGC ATGCTGGAGT CCCACTGGCA TTCAAGTGCA TAAAAGCTTC AGGTGGCAGA 3361GTTCCTAGGA CACTGGTTGG AGTTACAAGG ATTTATCCTG TTATGTACAG GGAGAGGTTT 3421TCTGACGGTC GTTTTGTGGT GAGGTCTGAA AGGATGGAAA GAAAAGCACT ACAGCTGTAT 3481CACCAGAGAG TGTCTAAGAT TGCAGAAGAC ATTCAGTCAG AACATGGAGA ACACTGCGAC 3541AACACTGATG ATAACGATGA AGGGGCAAAA ATATGCAAAA TGCTAGAGAG GGCAGCTGAG 3601CCTGAAATTC TTATGTCCAG CATGAGTTCA GAGCAGCTGC TGTCTTTCTC ATATTATCAA 3661GAAAAGCAAA AGATTGTCAG GCAAAATGAA GTAGCTAAGA AGGTTGAAAA TGCTCTTAAA 3721GTTGCTGGGC TTAGTTCAAG AGATGTTACA CCATTTTTGA AAGTGAGGGT GACGGGCCTT 3781ATCAGCAAAC ACTCCGCCAC AAAATCTGGC TGCAGGGAAG GGTTAATAAC AATTTGGAAC 3841CCTACCGAGA AGCAAAAATC CGACCTGGTG GAGGGACAAA TTTATTCTGT CACAGGACTG 3901TTGGCTTCAA GCTACTTTAC AGAAGTATCC TACTTGAGTG GTAGAGGATC ATCTACAGCA 3961TGGACGCCTT TAGCAACCGC ACAGACTACA AATTTTGAAC CATTTTTCAC CCCTCGTAAA 4021GCAGTTGAAT TGTCACATTT TGGTGAAGTG CCACTTACAA GCGAATTTGA CATTGCAGGT 4081GTTATTTTGT ATGTTGGGAA TGTTTATTTA TTGAACAACC AGAATAGGCA GTGGCTCTTT 4141TTGACAGATG GATCTAAATT TATCTCTGGA GAAAAGTATG AAGAGCAAGA TGACTGTCTT 4201CTGGCAGTTA GCTTTTCTTC CAAAACCACT GGCGAGGATT CTGCATTCTT CAATTATGCC 4261CTTTCTGGAC ATATAGTTGG TTTTAGTAAT CTGGTCAAGC GAGATAAAGA CCAGATGAGG 4321CACGTGTGGG TAGCTGAGGC GACAGAGAGC TCCACCTATA GTCTCTCCCA CGAGATACCT 4381AAAAAATCAC ATCTCAAAGA GGCTGCCACT TCTGCTGAAA AATGGGCTTC AAATTCTCAT 4441CCTATGATTC AGCATCTGAA GGAAAGAGTT CTGCAAATAG TTGGTGACAG TGGTGGCTGA //SEQ 
ID NO: 111: OsSPO11A cDNA from Oryza sativa (cultivar Kaybonnet)DEFINITION OsSPO11A cDNA 1329 bp ORGANISM Oryza sativa cv KaybonnetSOURCE Young flower AUTHORS Qiudeng Que CDS 1 . . . 1329 BASE COUNT225 a 460 c 405 g 239 t ORIGIN    1ATGTCGGAGA AGAAGCGCCG CGGCGGGGCA GGCGCGGGGG CCGCGTCGGG CTCCGCCTCC   61AAGAAGCCGC GGGTCTCCAC GGCGGCGTCG TACGCCGAGT CGCTCCGCTC GAAGCTCCGC  121CCCGACGCCT CCATCCTCGC CACCCTCCGC TCCCTGGCCT CCGCCTGCTC CAAACCCAAG  181CCCGCGGGGT CGTCGTCGTC GTCGTCGTCC GCCTCGAAGG CGCTCGCAGC CGAGGACGAC  241CCGGCCGCCA GCTACATCGT GGTGGCCGAC CAGGACTCCG CCTCCGTCAC CTCCCGCATC  301AACCGCCTCG TGCTCGCCGC GGCGCGCAGC ATCCTGTCCG GCCGGGGCTT CTCCTTCGCG  361GTGCCCTCCC GCGCCGCCTC CAACCAGGTC TACCTCCCGG ACCTCGACCG CATCGTGCTC  421GTCCGCCGCG AGTCCGCCAG GCCCTTCGCC AACGTCGCCA CCGCGCGGAA GGCCACCATC  481ACCGCGCGCG TCCTCTCCTT GGTCCACGCC GTCCTCCGCA GGGGGATCCA CGTCACCAAG  541CGTGACCTCT TCTACACCGA CGTCAAGCTC TTCGGCGACC AGGCGCAGTC CGACGCCGTC  601CTCGACGACG TCTCCTGTAT GCTCGGCTGC ACCCGCTCCT CCCTCCACGT CGTCGCGTCC  661GAGAAGGGCG TCGTCGTCGG GCGCCTCACC TTCGCCGACG ACGGCGACCG GATCGACTGC  721ACGCGCATGG GCGTCGGCGG GAAGGCCATC CCGCCCAACA TCGACAGGGT CTCAGGCATC  781GAGAGCGACG CTCTCTTCAT CTTGCTGGTG GAGAAGGACG CCGCGTTCAT GCGTCTCGCC  841GAGGACCGGT TCTACAACCG CTTCCCGTGC ATCATCTTGA CGGCGAAGGG GCAGCCGGAT  901GTCGCCACAC GGCTGTTCTT GCGGCGGCTT AAGGTGGAGC TGAAGCTGCC AGTGCTGGCA  961TTGGTGGACT CCGACCCATA TGGGCTGAAG ATCTTGTCAG TGTACATGTG TGGTTCCAAG 1021AACATGTCAT ATGACAGTGC CAACCTGACA ACACCGGATA TCAAGTGGCT CGGAGTGCGG 1081CCAAGCGATC TGGACAAGTA TCGGGTGCCG GAGCAGTGCC GGCTTCCGAT GACTGATCAC 1141GATATCAAGG TGGGGAAGGA GCTGCTTGAG GAGGACTTTG TGAAGCAGAA TGAAGGATGG 1201GTGAAGGAGC TGGAGACGAT GTTGCGGACG AGGCAGAAGG CTGAGATACA GGCTCTCAGT 1261TCATTTGGTT TCCAGTATCT CACTGAGGTC TATCTACCTC TCAAGCTGCA GCAACAGGAC 1321TGGATTTGA // SEQ ID NO: 112:OsSPO11B gDNA from Oryza sativa (cultivar Kaybonnet) DEFINITIONOsSpollB gDNA 1456 bp ORGANISM Oryza sativa cv Kaybonnet SOURCEOryza sativa cv. Kaybonnet calli REFERENCE 1 (bases 1 to 1456) CDS1 . . 
. 1444 BASE COUNT 452 a 268 c 326 g 410 t ORIGIN    1AGCAACCATG GATGATTCAA CGGATGACGA TTCGTATCAT CCAAGAAAAC ACTATGCTTA   61TGATCGTCAG GTTTCTTCAA GCAGATGGCG TACCAGCCGC GAGTATATCA GAGGTCCCGG  121CCCCGAAACT CATACTACTG AGAGTGCTCA AGATGGACAG GATCCACCTG CTGGAGTATA  181TTCCTATGGT TATTTTTCTG GCAGTGGTAA TGATCCTCAA GTTCAAGGAC ACTTTGTTCC  241GGAGATTCAA AAGTACAACC CTTACGTGAT TTTCAAAGGT GAACAACTCC CGGTTCCTAT  301ATGGGAACTG CCAGAGGAGA AGGTCCAAGA TTTTCATGAT AGGTACTTTA TTGCAAAAGA  361CAAGAGTCGA GTTGAAGCCA GGAAGACTCT GAATAGGTTG TTAGAGGGGA ACATCAATAC  421AATTGAAAGG GGACATGGAT ATAAATTCAA TATTCCAAAA TATACAGATA ACATGGAGTT  481TAATGAGGAA GTCAAGGTTT CTCTAGCAAA AGCAGGCAAG ACCATAAGCC GTTCCTTTTG  541CAATGCGAAT CAGCGGGAAG TTGCATCTAG GACTGGCTAT ACCATTGATC TAATAGAACG  601GACACTTGGG GCTGGATTGA ACATCTCGAA GAGAACTGTC TTATACACAA ACAAGGATCT  661GTTTGGGGAT CAAAGTAAAT CAGATCAAGC GATCAATGAC ATCTGCGCTT TGACAAATAT  721CAGAAGGGGC TCTTTGGGTA TAATAGCAGC TGAAAAAGGA ATTGTAGTTG GAAACATTTT  781CCTGGAATTG ACAAATGGCA AATCGATTAG TTGTTCTATT GGAGTGCAGA TACCACACAG  841GCTTGACCAG ATCAAAGATG TTTGTGTTGA AATAGGTTCA CGCAACATAG AGTATATTCT  901TGTTGTGGAA AAGCATACAA TGTTGAATTA TCTACTAGAG ATGGACTATC ACACCAATAA  961CAACTGTATA ATTCTGACAG GATGTGGCAT GCCAACCCTC CAAACAAGGG ATTTCCTCAG 1021ATTCTTGAAA CAACGCACTG GACTACCTGT CTTTGGACTT TGTGATCCAG ATCCTGAAGG 1081TATAAGTATT CTTGCTACGT ATGCTAGAGG GTCTTGCAAT TCAGCATATG ACAATTTCAA 1141TATTTCCGTG CCATCTATTT GTTGGGTTGG ATTGTCATCC TCAGACATGA TAAAGTTGAA 1201TTTGTCTGAG ACCAACTACT CACGTTTGTC TCGCGAGGAC AAAACTATGT TGAAGAACCT 1261TTGGCAGGAC GATTTGTCCG ATGTATGGAA ACGCAGAATC GAAGAAATGA TAAGTTTTGA 1321CAAGAAGGCC TCTTTTGAAG CTATTCATAG TTTGGGGTTT GATTATTTTG CAACCAATTT 1381GCTTCCGGAT ATGATTAACA AAGTACGAGA AGGCTATGTT CAGGTATATT TCTCACTCCT 1441ATAGCAACTT GTATTT // SEQ ID NO: 113: OsRad54A protein sequence LOCUSOsRad54A protein 1187 amino acid residues ORGANISMRice, Oryza sativa cv 
KaybonnetMEDDDDDQRLLHSLGVTSADIHDIERRIISQATTDPADSSGPTINGGHQPDDALAKLHHKLRSVQIEIDAVASTIKGAKLKQPSGNKPHEHKGKDQPDHHGAGHLQQALAADRLTSLRKAKAQIQKEILQSHLSPSASNRKDKMLAMLVQDEPRHKKPPVGPKNIVKRPMKTVTYDDDNNFDAVLDGASAGFMETEREELIRKGLLTPFHKLKGFEKRVELPEPSHRQDDSAGQTEEAMEASRIARVAQSLKQIAQNRPATKLLDSESLPKLDAPAAPFQRLGKPLKRPVSPSSDEQEKKRPRNKTKRPLPGKKWRKANSIKESSLDDNDVGEAAVSVSDDDEDQVTEGSDELTDVTLEGGLRIPGTLYTQLFDYQKVGVQWLWELHCQRAGGIIGDEMGLGKTVQVLSFLGSLHNSGLYKPSIVVCPVTLLQQWRREASRWYPKFKVEILHDSANSSSKKSKRSSDSDSEASWDSDQEEAVTCSKPAKKWDDLISRVVSSGSGLLLTTYEQLRILGEKLLDIEWGYAVLDEGHRIRNPNAEITLVCKQLQTVHRIIMTGAPIQNKLSELWSLFDFVFPGKLGVLPVFEAEFSVPITVGGYANATPLQVSTAYRCAVVLRDLVMPYLLRRMKADVNAQLPKKTEHVLFCSLTTEQRATYRAFLASSEVEQIFDGNRNSLYGIDVLRKICNHPDLLEREHAAQNPDYGNPERSGKMKVVEQVLKVWKEQGHRVLLFTQTQQMLDIMGNFLTACEYQYRRMDGLTPAKQRMALIDEFNNTDEIFIFILTTKVGGLGINLTGANRIIIYDPDWNPSTDMQARERAWRIGQTRDVTVYRLITRGTIEEKVYHRQVYKHFLTNKVLKDPQQRRFFKARDMKDLFTLQDDDNNGSTETSNIFSQLSEDVNIGVPSDKQQDQLYAASATPTTSGTEPSSSRHGQGKEDHCPDQADEECNILKSLFDAQGIHSAINHDAIMNANDDQKLRLEAEATQVAQRAAEALRQSRMLRSHESFSVPTWTGRAGAAGAPSSVRRKFGSTLNTQLVNSSQPSETSNGRGQSLQVGALNGKALSSAELLARIRGTREGAASDALEHQLNLGSASNHTSSSSGNGRASSSSTRSMIVQPEVLIRQLCTFIQQHGGSASSTSITEHFKNRILSKDMLLFKNLLKEIATLQRGANGATWVLKPDYQ // SEQ ID NO: 114:OsRad54B Protein sequence DEFINITIONOsRad54B protein 1148 amino acid residues ORGANISMOryza sativa cv 
KaybonnetMRTSTTSDSPSPSPQNKASFNTSRGAAFRDEEPGAKDNEVEKRKPLILHLKKRSTKELSTDTTSSKSGLLGKSSEEKQEKHGSALKVKKHLHPMELSPKKYKNKKQHNHRDSKRSEAKKVQYLASDVDSDSSMEPSTSLEHSESPPPKRKSLDGRTPASSTKKGKKKVKFIDKKHPENAVHITEKEHGGAGDKITTQGDLQVDRILGCRLQTSQIISPAHASSEQIDMAPPSASGATEPSQALSKGLHEEIQSSNSDTNVTEDACADELANDGGENNLDCSDAQKESNVRSHGHKESLNAKEIMNTASACSADQIVTVKDAGAVQTYVTASVNGEYETVTDIPEEKNDTKHPVSKADTEVHTKQEHTPDSKLHGKLENYKAKYGTGLINICKEQWCQPQRVIALRTSLDEIEEALIKWCALPYDECTWERLDEPTMVKYAHLVTQFKKFESQALDKDKGGSHAKPREHQEFNMLVEQPKELQGGMLFPHQLEALNWLRKCWYKSKNVILADEMGLGKTVSACAFLSSLCCEYKINLPCLVLVPLSTMPNWMAEFASWAPHLNVVEYHGSARARSIIRQYEWHEGDASQMGKIKKSHKFNVLLTTYEMVLVDAAYLRSVSWEVLIVDEGHRLKNSSSKLFSLLNTLSFQHRVLLTGTPLQNNIGEMYNLLNFLQPASFPSLASFEEKFNDLTTTEKVEELKNLVAPHMLRRLKKDAMQNIPPKTERMVPVELTSIQAEYYRAMLTKNYQVLRNIGKGGAHQSLLNIVMQLRKVCNHPYLIPGTEPESGSPEFLHEMRIKASAKLTLLHSMLKILHKDGHRVLIFSQMTKLLDILEDYLTWEFGPKTFERVDGSVSVAERQAAIARFNQDKSRFVFLLSTRSCGLGINLATADTVIIYDSDFNPHADIQAMNRAHRIGQSNRLLVYRLVVRASVEERILHLAKKKLMLDQLFVNKSESQKEVEDIIRWGTEELFRNSDVAVKDNNEASGAKNDVAEVEFKHKRKTGGLGDVYEDRCADGSAKFNWDENAITKLLDRSNVPSTVAESTDGDLDNDMLGTVKSIDWNDELNDDPGATEDIPNIDNDGCEQASEAKQDAANRVEENEWDKLLRVRWEQYQTEEEASLGRGKRLRKAVSYRETFAT IPNEALSE //SEQ ID NO: 115: OsBRCA1 protein sequence DEFINITIONOsBRCA1 protein 987 amino acid residues ORGANISMOryza sativa cv 
KaybonnetMADTGSLEKMGRELKCPICLSLLSSAVSISCNHVFCNDCLTESMKSTSSCPVCKVPFRRREMRPAPHMDNLVSIFKSMEAAAGTNVVSTQEAPVVKLADGSDCVNSGKNSKRSQKSLTRKRKVTSEMEKNTAKDATASASQPTTKPSFSTNKRIQVKPFPESETPIRAEKIMKPEEPKNNLNNDVEGKNKAVASGQPGSPSLSPFFWLREQEEQEGCTAETLSETQSLDTPLRHNAPSFSDIKDSDDEIPLNTTPNSKAAATELFDSEIFEWTQRPCSPELYSTPLKKQSKAKSKLDQIEEKGDEEDVHIGGSFDKLGSASNAAQLVNTKATKQKRKKTSPSNKNSAKLSNRAEPCIKKSDANQQGSNRRKSAALKSCQKSSSAVGRNTSGRRNKASSNSKPIHGSSDNSPESYLPKEGLDVEAPDKPLSERIQNLEKTSRRKGSARKLEMAGKTISDTTEKNSEPRSKRVRRMSDHAIAKPVEVPSGSGNETEIPQLHTLTKGSIQRKSSNARRHSKVCGEQEGKNKLENTTMTPIILHGKCQNKEAVCTAPSVRTASVKYKQAKFSEQPDCFGTENFGNLQACPARNVLLKKCEVSTLKVSCAFCQTDVITEESGEMVHYQNGKQVPAEFNGGANVVHSHKNCLEWAPDVYFEDDSAFNLTTELARSRRIKCACCGIKGAALGCFEMSCRRSFHFTCAKLIPECRWDNENFVMLCPLHRSTKLPNENSEQQKQPKRKTTLKGSSQIGSNQDCGNNWKWPSGSPQKWVLCCSSLSSSEKGLVSEFAKLAGVPISATWSPNVTHVIASTDLSGACKRTLKFLMAILNGRWIVSIDWVKTCMECMEPIDEHKFEVATDVHGITDGPRLGRCRVIDRQPKLFDSMRFYLHGDYTKSYRGYLQDLVVAAGGIVLQRKPVSRDQQKLLDDSSDLLIVYSFENQDRAKSKAETKAADRRQADAQALACASGGRVVSSAWVIDSIAACNLQPL // SEQ ID NO: 116: OsBRCA2 Protein sequenceDEFINITION OsBRCA2 protein 1499 amino acid resisues ORGANISMOryza sativa cv 
KaybonnetMADLFNQALDKLVAADGMAEAIEDSGKGAVFCTGLGGSVAVSERAVERAKALVGEVAEEISNERRQPFGDGSNLECGLGESNVSFKGGVHKDSLSPMFQTGSGKMVSLSKGSIQKARAVLEGNAENSSVIAVQSMFHTGLVRPDPVSRSSTDNAMTVLEGQTNPKQGDVADVYDKENFPLFQTGSGKAVSVSVASIQKAKAVLEQNNTENTEDFGRPDQSLIFQTGSRRPVLISERSSSVVKDGGAENIVFQTGLGRPVVVSQTSIQKARTVLDQECAKRSGHGDTNVSTTTFQTETPTPVLMSGGLTMNDRSVTPEGGVSMQGNFLEADGHLPLFQTGLGRSISVSKGSIKRASALLEPRNITKELEDEAHSDDGCATPMFKTGSGRSITASENSRKKAHVVLEGEEPVKNVNNDTGEAIAPMLHAGMQKFAPQNRNSSHKAITLMEQGSSMEEDRGNEPPMFRTGSGKSVLISHSSVQKARAVLEEEGNMKKENHKQLSNVDKYIPIFTSPLKTSYARTVHISSVGVSRAATLLGLEENTLSTQLLGHVGDKLGTKITVERENSEHQFGVASVSGISGGCPISSGPAENQVLMDPHQHFAFSKTTFSDSSEQAIRFSTAGGRTMAIPSDALQRAKNLLGESDLEVSPNNLLGHSSASACKENIQNSTGLRKEGEPDLLKSRGNSKTEPAQFSIPAKPDRKHTDSLEYAVPDATLANGNSVRLHAARDFHPINEIPKISKPSSRCSFGTENASDTKDKARRLQMPSGPLIDITNYIDTHSVNTDYLAGEKRRFGGRNSISPFKRPRSSRFIAPININNPSPSGVSKLPIQINPCRTKLSSCYPFQHQRKSCEEYFGGPPCFKYLTEDVTDEVKLMDAKKAEKYKFKTDTGAEEFQKMLLACGASLTYTTKEWVSNHYKWIVWKLASLERCYPTRAAGKFLKVGNVLEELKYRYDREVNNGHRSAIKKILEGNASPSLMMVLCISAIYSCPDLNNSKPEDDRAHTDDDNSENKSLRPAKRNMSTKIELTDGWYSLDASLDLALLEQLEKRKLFIGQKLRIWGASLCGWAGPVSFHEASGTVKLMIHINGTYRARWDETLGLCKHAGVPLAFKCIKASGGRVPRTLVGVTRIYPVMYRERFSDGRFVVRSERMERKALQLYHQRVSKIAEDIQSEHGEHCDNTDDNDEGAKICKMLERAAEPEILMSSMSSEQLLSFSYYQEKQKIVRQNEVAKKVENALKVAGLSSRDVTPFLKVRVTGLISKHSATKSGCREGLITIWNPTEKQKSDLVEGQIYSVTGLLASSYFTEVSYLSGRGSSTAWTPLATAQTTNEEPFFTPRKAVELSHFGEVPLTSEFDIAGVILYVGNVYLLNNQNRQWLFLTDGSKFISGEKYEEQDDCLLAVSFSSKTTGEDSAFFNYALSGHIVGESNLVKRDKDQMRHVWVAEATESSTYSLSHEIPKKSHLKEAATSAEKWASNSHPMIQHLKERVLQIVGDSGG //SEQ ID NO: 117: OsSPO11A protein DEFINITION OsSPO1lA protein, 442 amino acids ORGANISM Oryza sativa cv 
KaybonnetMSEKKRRGGAGAGAASGSASKKPRVSTAASYAESLRSKLRPDASILATLRSLASACSLPKPAGSSSSSSSASLALAAEDDPAASYIVVADQDSASVTSRINRLVLAAARSILSGRGFSFAVPSRAASNQVYLPDLDRIVLVRRESARPFANVATARLATITARVLSLVHAVLRRGIHVTKRDLFYTDVKLFGDQAQSDAVLDDVSCMLGCTRSSLHVVASEKGVVVGRLTFADDGDRIDCTRMGVGGKAIPPNIDRVSGIESDALFILLVEKDAAFMRLAEDRFYNRFPCIILTAKGQPDVATRLFLRRLKVELKLPVLALVDSDPYGLKILSVYMCGSKNMSYDSANLTTPDIKWLGVRPSDLDKYRVPEQCRLPMTDHDIKVGKELLEEDFVKQNEGWVKELETMLRTRQKAEIQALSSFGFQYLTEVYLPLKLQQQDWI // SEQ ID NO: 118: OsSPO11B DEFINITIONOsSpo11B protein, 478 amino acid residues ORGANISMOryza sativa cv KaybonnetMDDSTDDDSYHPRKHYAYDRQVSSSRWRTSREYIRGPGPETHTTESAQDGQDPPAGVYSYGYFSGSGNDPQVQGHFVPEIQKYNPYVIFKGEQLPVPIWELPEEKVQDFHDRYFIAKDKSRVEARKTLNRLLEGNINTIERGHGYKFNIPKYTDNMEFNEEVKVSLAKAGKTISRSFCNANQREVASRTGYTIDLIERTLGAGLNISKRTVLYTNKDLFGDQSKSDQAINDICALTNIRRGSLGIIAAEKGIVVGNIFLELTNGKSISCSIGVQIPHRLDQIKDVCVEIGSRNIEYILVVEKHTMLNYLLEMDYHTNNNCIILTGCGMPTLQTRDFLRFLKQRTGLPVFGLCDPDPEGISILATYARGSCNSAYDNFNISVPSICWVGLSSSDMIKLNLSETNYSRLSREDKTMLKNLWQDDLSDVWKRRIEEMISFDKKASFEAIHSLGFDYFATNLLPDMINKVREGYVQVYFSLL //

All publications, published patent documents, and patent applications cited in this specification are indicative of the level of skill in the art(s) to which the invention pertains. All publications, published patent documents, and patent applications cited herein are hereby incorporated by reference to the same extent as though each individual publication, published patent document, or patent application was specifically and individually indicated as being incorporated by reference.

The foregoing describes the invention with reference to various embodiments and examples. No particular embodiment, example, or element of a particular embodiment or example is to be construed as a critical, required, or essential element or feature of any or all of the claims. As used herein, the terms “comprises,” “comprising,” “includes,” “including,” “contains,” “containing,” and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, product-by-process, or composition of matter that comprises, includes, or contains an element or list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, product-by-process, or composition of matter. Further, no element described herein is required for the practice of the invention unless expressly described as “essential” or “critical.”

It will be appreciated that various modifications and substitutions can be made to the disclosed embodiments without departing from the scope of the invention as set forth in the claims below. The specification, including the drawings and examples, is to be regarded in an illustrative manner, rather than a restrictive one, and all such modifications and substitutions are intended to be included within the scope of the invention. Accordingly, the scope of the invention should be determined by the appended claims and their legal equivalents, rather than by the examples given above. For example, the steps recited in any of the method or process claims may be executed in any feasible order and are not limited to an order presented in any of the claims.

1-14. (canceled)
15. A method of preparing a host cell having a genome with a target site for reiterative gene stacking in the host cell genome, the method comprising the steps of: (a) providing a host cell having a genome comprising a target sequence comprising: (i) a truncated functional sequence; and (ii) a host homology sequence; (b) introducing into the host cell a donor sequence comprising: (i) a heterologous sequence of interest or fragment thereof; (ii) a donor homology sequence homologous to the host homology sequence; (iii) a sequence that completes the truncated functional sequence located between the heterologous sequence of interest or fragment thereof and the donor homology sequence; (iv) a first recombinase recognition site located between the donor homology sequence and the sequence that completes the truncated functional sequence; and (v) a second recombinase recognition site located between the sequence that completes the truncated functional sequence and the heterologous sequence of interest or fragment thereof; and further wherein the first and second recombinase recognition sites are oriented relative to one another such that the sequence that completes the truncated functional sequence is excisable in the presence of a recombinase; (c) obtaining in the host cell a recombination product comprising: (i) the heterologous sequence of interest; (ii) a recombined sequence resulting from homologous recombination of the host homology sequence and the donor homology sequence; (iii) a restored functional sequence comprising the truncated functional sequence, the sequence that completes the truncated functional sequence, the recombined sequence located between the truncated functional sequence and the sequence that completes the truncated functional sequence, and the first recombinase recognition site located between the sequence that completes the truncated functional sequence and the recombined sequence; and (iv) the second recombinase recognition site located between the heterologous sequence of interest and the restored functional sequence.
16. The method of claim 15, wherein the restored functional sequence encodes a marker.
17. The method of claim 16, wherein the marker is selected from the group consisting of NPTII, HPT, PAT, BAR, EPSPS, GAT, HPPD, ALS, PPO, PMI, GUS, LUC, GFP, RFP, and CFP.
18. A host cell produced by the method of claim 15.
19. The host cell of claim 18, which is a plant cell.
20. A plant or plant part comprising the plant cell of claim 19.
21. The method of claim 15, wherein the target sequence of step (a) further comprises (iii) a second host homology sequence, and wherein the host homology sequence of (a)(ii) is located between the second host homology sequence and the truncated functional sequence; and wherein the donor sequence of step (b) further comprises (vi) a second donor homology sequence having homology to the second host homology sequence, and wherein the heterologous sequence of interest is located between the second donor homology sequence and the second recombinase recognition site; and wherein the recombination product of step (c) further comprises (v) a second recombined sequence resulting from homologous recombination of the second host homology sequence and the second donor homology sequence, and wherein the heterologous sequence of interest of step (c)(i) is located between the second recombined sequence and the second recombinase recognition site.
22. A host cell produced by the method of claim 21.
23. The host cell of claim 22, which is a plant cell.
24. A plant or plant part comprising the plant cell of claim 23.
25. The method of claim 15, further comprising: (d) introducing into the host cell of step (c) a recombinase or recombinase coding sequence to thereby excise the sequence that completes the truncated functional sequence and yield an excision product comprising: (i) the heterologous sequence of interest; (ii) the truncated functional sequence; (iii) the recombined sequence located between the heterologous sequence of interest and the truncated functional sequence; and (iv) a regenerated target recombinase recognition site located between the heterologous sequence of interest and the recombined sequence.